Predicting phenotypic traits of prokaryotes from protein domain frequencies

BACKGROUND: Establishing the relationship between an organism's genome sequence and its phenotype is a fundamental challenge that remains largely unsolved. Accurately predicting microbial phenotypes solely based on genomic features will allow us to infer relevant phenotypic characteristics when...


Bibliographic Details
Main authors: Lingner, Thomas; Mühlhausen, Stefanie; Gabaldón, Toni; Notredame, Cedric; Meinicke, Peter
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2955703/
https://www.ncbi.nlm.nih.gov/pubmed/20868492
http://dx.doi.org/10.1186/1471-2105-11-481
author Lingner, Thomas
Mühlhausen, Stefanie
Gabaldón, Toni
Notredame, Cedric
Meinicke, Peter
collection PubMed
description BACKGROUND: Establishing the relationship between an organism's genome sequence and its phenotype is a fundamental challenge that remains largely unsolved. Accurately predicting microbial phenotypes solely based on genomic features will allow us to infer relevant phenotypic characteristics when the availability of a genome sequence precedes experimental characterization, a scenario that is favored by the advent of novel high-throughput and single cell sequencing techniques. RESULTS: We present a novel approach to predict the phenotype of prokaryotes directly from their protein domain frequencies. Our discriminative machine learning approach provides high prediction accuracy of relevant phenotypes such as motility, oxygen requirement or spore formation. Moreover, the set of discriminative domains provides biological insight into the underlying phenotype-genotype relationship and enables deriving hypotheses on the possible functions of uncharacterized domains. CONCLUSIONS: Fast and accurate prediction of microbial phenotypes based on genomic protein domain content is feasible and has the potential to provide novel biological insights. First results of a systematic check for annotation errors indicate that our approach may also be applied to semi-automatic correction and completion of the existing phenotype annotation.
format Text
id pubmed-2955703
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2955703 2010-10-18 Predicting phenotypic traits of prokaryotes from protein domain frequencies Lingner, Thomas; Mühlhausen, Stefanie; Gabaldón, Toni; Notredame, Cedric; Meinicke, Peter BMC Bioinformatics Research Article BioMed Central 2010-09-24 /pmc/articles/PMC2955703/ /pubmed/20868492 http://dx.doi.org/10.1186/1471-2105-11-481 Text en Copyright ©2010 Lingner et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Predicting phenotypic traits of prokaryotes from protein domain frequencies
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2955703/
https://www.ncbi.nlm.nih.gov/pubmed/20868492
http://dx.doi.org/10.1186/1471-2105-11-481