
PSORTm: a bacterial and archaeal protein subcellular localization prediction tool for metagenomics data

MOTIVATION: Many methods for microbial protein subcellular localization (SCL) prediction exist; however, none is readily available for analysis of metagenomic sequence data, despite growing interest from researchers studying microbial communities in humans, agri-food relevant organisms and in other...


Bibliographic Details
Main authors: Peabody, Michael A; Lau, Wing Yin Venus; Hoad, Gemma R; Jia, Baofeng; Maguire, Finlay; Gray, Kristen L; Beiko, Robert G; Brinkman, Fiona S L
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7214030/
https://www.ncbi.nlm.nih.gov/pubmed/32108861
http://dx.doi.org/10.1093/bioinformatics/btaa136
_version_ 1783531898842120192
author Peabody, Michael A
Lau, Wing Yin Venus
Hoad, Gemma R
Jia, Baofeng
Maguire, Finlay
Gray, Kristen L
Beiko, Robert G
Brinkman, Fiona S L
author_sort Peabody, Michael A
collection PubMed
description MOTIVATION: Many methods for microbial protein subcellular localization (SCL) prediction exist; however, none is readily available for analysis of metagenomic sequence data, despite growing interest from researchers studying microbial communities in humans, agri-food relevant organisms and in other environments (e.g. for identification of cell-surface biomarkers for rapid protein-based diagnostic tests). We wished to also identify new markers of water quality from freshwater samples collected from pristine versus pollution-impacted watersheds. RESULTS: We report PSORTm, the first bioinformatics tool designed for prediction of diverse bacterial and archaeal protein SCL from metagenomics data. PSORTm incorporates components of PSORTb, one of the most precise and widely used protein SCL predictors, with an automated classification by cell envelope. An evaluation using 5-fold cross-validation with in silico-fragmented sequences with known localization showed that PSORTm maintains PSORTb’s high precision, while sensitivity increases proportionately with metagenomic sequence fragment length. PSORTm’s read-based analysis was similar to PSORTb-based analysis of metagenome-assembled genomes (MAGs); however, the latter requires non-trivial manual classification of each MAG by cell envelope, and cannot make use of unassembled sequences. Analysis of the watershed samples revealed the importance of normalization and identified potential biomarkers of water quality. This method should be useful for examining a wide range of microbial communities, including human microbiomes, and other microbiomes of medical, environmental or industrial importance. AVAILABILITY AND IMPLEMENTATION: Documentation, source code and docker containers are available for running PSORTm locally at https://www.psort.org/psortm/ (freely available, open-source software under GNU General Public License Version 3). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-7214030
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-72140302020-05-15 PSORTm: a bacterial and archaeal protein subcellular localization prediction tool for metagenomics data Peabody, Michael A Lau, Wing Yin Venus Hoad, Gemma R Jia, Baofeng Maguire, Finlay Gray, Kristen L Beiko, Robert G Brinkman, Fiona S L Bioinformatics Original Papers MOTIVATION: Many methods for microbial protein subcellular localization (SCL) prediction exist; however, none is readily available for analysis of metagenomic sequence data, despite growing interest from researchers studying microbial communities in humans, agri-food relevant organisms and in other environments (e.g. for identification of cell-surface biomarkers for rapid protein-based diagnostic tests). We wished to also identify new markers of water quality from freshwater samples collected from pristine versus pollution-impacted watersheds. RESULTS: We report PSORTm, the first bioinformatics tool designed for prediction of diverse bacterial and archaeal protein SCL from metagenomics data. PSORTm incorporates components of PSORTb, one of the most precise and widely used protein SCL predictors, with an automated classification by cell envelope. An evaluation using 5-fold cross-validation with in silico-fragmented sequences with known localization showed that PSORTm maintains PSORTb’s high precision, while sensitivity increases proportionately with metagenomic sequence fragment length. PSORTm’s read-based analysis was similar to PSORTb-based analysis of metagenome-assembled genomes (MAGs); however, the latter requires non-trivial manual classification of each MAG by cell envelope, and cannot make use of unassembled sequences. Analysis of the watershed samples revealed the importance of normalization and identified potential biomarkers of water quality. This method should be useful for examining a wide range of microbial communities, including human microbiomes, and other microbiomes of medical, environmental or industrial importance. 
AVAILABILITY AND IMPLEMENTATION: Documentation, source code and docker containers are available for running PSORTm locally at https://www.psort.org/psortm/ (freely available, open-source software under GNU General Public License Version 3). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2020-05-15 2020-02-28 /pmc/articles/PMC7214030/ /pubmed/32108861 http://dx.doi.org/10.1093/bioinformatics/btaa136 Text en © The Author(s) 2020. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title PSORTm: a bacterial and archaeal protein subcellular localization prediction tool for metagenomics data
topic Original Papers