
Deriving enzymatic and taxonomic signatures of metagenomes from short read data

BACKGROUND: We propose a method for deriving enzymatic signatures from short read metagenomic data of unknown species. The short read data are converted to six pseudo-peptide candidates. We search for occurrences of Specific Peptides (SPs) on the latter. SPs are peptides that are indicative of enzym...
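
The read-to-peptide step summarized above can be illustrated with a minimal sketch. It assumes the "six pseudo-peptide candidates" are the six reading-frame translations of a read (three offsets on each strand) under the standard genetic code, and that an SP hit is an exact substring match. The function names, the toy read, and the peptide and category labels are hypothetical and are not taken from the article. The later conversion of hit counts into estimated gene numbers is not shown.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons (e.g. with N) become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frames(read):
    """Return the six pseudo-peptide candidates of one read."""
    read = read.upper()
    strands = (read, reverse_complement(read))
    return [translate(s[offset:]) for s in strands for offset in range(3)]


def count_sp_hits(reads, sps):
    """Count reads that contain each peptide, grouped by its category label.

    `sps` maps a peptide string to a category label (hypothetical layout).
    """
    hits = Counter()
    for read in reads:
        frames = six_frames(read)
        for peptide, label in sps.items():
            if any(peptide in frame for frame in frames):
                hits[label] += 1
    return hits


if __name__ == "__main__":
    toy_read = "ATGGCCAAAGGT"                       # made-up read
    toy_sps = {"MAKG": "EC-A", "TFGH": "EC-B"}      # made-up peptides and labels
    print(six_frames(toy_read))
    print(count_sp_hits([toy_read], toy_sps))
```

Matching on translated frames rather than on nucleotides means no assembly or gene calling is needed before the search, which is the property the summary emphasizes.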


Bibliographic Details
Main Authors: Weingart, Uri; Persi, Erez; Gophna, Uri; Horn, David
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2922197/
https://www.ncbi.nlm.nih.gov/pubmed/20649951
http://dx.doi.org/10.1186/1471-2105-11-390
author Weingart, Uri
Persi, Erez
Gophna, Uri
Horn, David
collection PubMed
description BACKGROUND: We propose a method for deriving enzymatic signatures from short read metagenomic data of unknown species. The short read data are converted to six pseudo-peptide candidates. We search for occurrences of Specific Peptides (SPs) on the latter. SPs are peptides that are indicative of enzymatic function as defined by the Enzyme Commission (EC) nomenclature. The number of SP hits on an ensemble of short reads is counted and then converted to estimates of numbers of enzymatic genes associated with different EC categories in the studied metagenome. Relative amounts of different EC categories define the enzymatic spectrum, without the need to perform genomic assemblies of short reads. RESULTS: The method is developed and tested on 22 bacteria for which there exist many EC annotations in UniProt. Enzymatic signatures are derived for 3 metagenomes, and their functional profiles are explored. We extend the SP methodology to taxon-specific SPs (TSPs), allowing us to estimate taxonomic features of metagenomic data from short reads. Using recent Swiss-Prot data we obtain TSPs for different phyla of bacteria, and different classes of proteobacteria. These allow us to analyze the major taxonomic content of 4 different metagenomic datasets. CONCLUSIONS: The SP methodology can be successfully extended to applications on short read genomic and metagenomic data. This leads to direct derivation of enzymatic signatures from raw short reads. Furthermore, by employing TSPs, one obtains valuable taxonomic information.
format Text
id pubmed-2922197
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2922197 2010-08-17 Deriving enzymatic and taxonomic signatures of metagenomes from short read data. Weingart, Uri; Persi, Erez; Gophna, Uri; Horn, David. BMC Bioinformatics, Methodology Article. BioMed Central 2010-07-22 /pmc/articles/PMC2922197/ /pubmed/20649951 http://dx.doi.org/10.1186/1471-2105-11-390 Text en Copyright ©2010 Weingart et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Deriving enzymatic and taxonomic signatures of metagenomes from short read data
topic Methodology Article