
Systematic Association of Genes to Phenotypes by Genome and Literature Mining

One of the major challenges of functional genomics is to unravel the connection between genotype and phenotype. So far no global analysis has attempted to explore those connections in the light of the large phenotypic variability seen in nature. Here, we use an unsupervised, systematic approach for...


Bibliographic Details
Main authors: Korbel, Jan O; Doerks, Tobias; Jensen, Lars J; Perez-Iratxeta, Carolina; Kaczanowski, Szymon; Hooper, Sean D; Andrade, Miguel A; Bork, Peer
Format: Text
Language: English
Published: Public Library of Science 2005
Subjects: Research Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1073694/
https://www.ncbi.nlm.nih.gov/pubmed/15799710
http://dx.doi.org/10.1371/journal.pbio.0030134
author Korbel, Jan O
Doerks, Tobias
Jensen, Lars J
Perez-Iratxeta, Carolina
Kaczanowski, Szymon
Hooper, Sean D
Andrade, Miguel A
Bork, Peer
collection PubMed
description One of the major challenges of functional genomics is to unravel the connection between genotype and phenotype. So far no global analysis has attempted to explore those connections in the light of the large phenotypic variability seen in nature. Here, we use an unsupervised, systematic approach for associating genes and phenotypic characteristics that combines literature mining with comparative genome analysis. We first mine the MEDLINE literature database for terms that reflect phenotypic similarities of species. Subsequently we predict the likely genomic determinants: genes specifically present in the respective genomes. In a global analysis involving 92 prokaryotic genomes we retrieve 323 clusters containing a total of 2,700 significant gene–phenotype associations. Some clusters contain mostly known relationships, such as genes involved in motility or plant degradation, often with additional hypothetical proteins associated with those phenotypes. Other clusters comprise unexpected associations; for example, a group of terms related to food and spoilage is linked to genes predicted to be involved in bacterial food poisoning. Among the clusters, we observe an enrichment of pathogenicity-related associations, suggesting that the approach reveals many novel genes likely to play a role in infectious diseases.
format Text
id pubmed-1073694
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-1073694 2005-04-05 PLoS Biol Research Article Public Library of Science 2005-05 2005-04-05 /pmc/articles/PMC1073694/ /pubmed/15799710 http://dx.doi.org/10.1371/journal.pbio.0030134 Text en Copyright: © 2005 Korbel et al.
http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Systematic Association of Genes to Phenotypes by Genome and Literature Mining
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1073694/
https://www.ncbi.nlm.nih.gov/pubmed/15799710
http://dx.doi.org/10.1371/journal.pbio.0030134