A supervised learning approach for taxonomic classification of core-photosystem-II genes and transcripts in the marine environment

BACKGROUND: Cyanobacteria of the genera Synechococcus and Prochlorococcus play a key role in marine photosynthesis, which contributes to the global carbon cycle and to the world oxygen supply. Recently, genes encoding the photosystem II reaction center (psbA and psbD) were found in cyanophage genome...

Bibliographic Details
Main authors: Tzahor, Shani; Man-Aharonovich, Dikla; Kirkup, Benjamin C; Yogev, Tali; Berman-Frank, Ilana; Polz, Martin F; Béjà, Oded; Mandel-Gutfreund, Yael
Format: Text
Language: English
Published: BioMed Central 2009
Subjects: Research Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2696472/
https://www.ncbi.nlm.nih.gov/pubmed/19445709
http://dx.doi.org/10.1186/1471-2164-10-229
author Tzahor, Shani
Man-Aharonovich, Dikla
Kirkup, Benjamin C
Yogev, Tali
Berman-Frank, Ilana
Polz, Martin F
Béjà, Oded
Mandel-Gutfreund, Yael
collection PubMed
description BACKGROUND: Cyanobacteria of the genera Synechococcus and Prochlorococcus play a key role in marine photosynthesis, which contributes to the global carbon cycle and to the world oxygen supply. Recently, genes encoding the photosystem II reaction center (psbA and psbD) were found in cyanophage genomes. This phenomenon suggested that the horizontal transfer of these genes may be involved in increasing phage fitness. To date, a very small percentage of marine bacteria and phages has been cultured. Thus, mapping genomic data extracted directly from the environment to its taxonomic origin is necessary for a better understanding of phage-host relationships and dynamics. RESULTS: To achieve an accurate and rapid taxonomic classification, we employed a computational approach combining a multi-class Support Vector Machine (SVM) with a codon usage position specific scoring matrix (cuPSSM). Our method has been applied successfully to classify core-photosystem-II gene fragments, including partial sequences coming directly from the ocean, to seven different taxonomic classes. Applying the method on a large set of DNA and RNA psbA clones from the Mediterranean Sea, we studied the distribution of cyanobacterial psbA genes and transcripts in their natural environment. Using our approach, we were able to simultaneously examine taxonomic and ecological distributions in the marine environment. CONCLUSION: The ability to accurately classify the origin of individual genes and transcripts coming directly from the environment is of great importance in studying marine ecology. The classification method presented in this paper could be applied further to classify other genes amplified from the environment, for which training data is available.
format Text
id pubmed-2696472
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2696472 2009-06-16
BMC Genomics
Research Article
BioMed Central 2009-05-16
Copyright © 2009 Tzahor et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A supervised learning approach for taxonomic classification of core-photosystem-II genes and transcripts in the marine environment
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2696472/
https://www.ncbi.nlm.nih.gov/pubmed/19445709
http://dx.doi.org/10.1186/1471-2164-10-229