PhenoLink - a web-tool for linking phenotype to ~omics data for bacteria: application to gene-trait matching for Lactobacillus plantarum strains

BACKGROUND: Linking phenotypes to high-throughput molecular biology information generated by ~omics technologies can reveal cellular mechanisms underlying an organism's phenotype. ~Omics datasets are often very large and noisy with many features (e.g., genes, metabolite abundances). Thus,...

Bibliographic Details
Main Authors: Bayjanov, Jumamurat R, Molenaar, Douwe, Tzeneva, Vesela, Siezen, Roland J, van Hijum, Sacha A F T
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3366882/
https://www.ncbi.nlm.nih.gov/pubmed/22559291
http://dx.doi.org/10.1186/1471-2164-13-170
author Bayjanov, Jumamurat R
Molenaar, Douwe
Tzeneva, Vesela
Siezen, Roland J
van Hijum, Sacha A F T
collection PubMed
description BACKGROUND: Linking phenotypes to high-throughput molecular biology information generated by ~omics technologies can reveal cellular mechanisms underlying an organism's phenotype. ~Omics datasets are often very large and noisy with many features (e.g., genes, metabolite abundances). Thus, associating phenotypes with ~omics data requires an approach that is robust to noise and can handle large and diverse data sets. RESULTS: We developed a web-tool, PhenoLink (http://bamics2.cmbi.ru.nl/websoftware/phenolink/), that links phenotype to ~omics data sets using well-established as well as new techniques. PhenoLink imputes missing values and preprocesses input data (i) to decrease inherent noise in the data and (ii) to counterbalance pitfalls of the Random Forest algorithm, on which feature (e.g., gene) selection is based. Preprocessed data is used in feature (e.g., gene) selection to identify relations to phenotypes. We applied PhenoLink to identify gene-phenotype relations based on the presence/absence of 2847 genes in 42 Lactobacillus plantarum strains and phenotypic measurements of these strains in several experimental conditions, including growth on sugars and nitrogen-dioxide production. Genes were ranked based on their importance (predictive value) to correctly predict the phenotype of a given strain. In addition to known gene-to-phenotype relations, we also found novel relations. CONCLUSIONS: PhenoLink is an easily accessible web-tool to facilitate identifying relations from large and often noisy phenotype and ~omics datasets. Visualization of links to phenotypes offered in PhenoLink allows prioritizing links, finding relations between features, finding relations between phenotypes, and identifying outliers in phenotype data.
PhenoLink can be used to uncover phenotype links to a multitude of ~omics data, e.g., gene presence/absence (determined by, e.g., CGH or next-generation sequencing), gene expression (determined by, e.g., microarrays or RNA-seq), or metabolite abundance (determined by, e.g., GC-MS).
format Online
Article
Text
id pubmed-3366882
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3366882 2012-06-05 BMC Genomics Methodology Article BioMed Central 2012-05-04 /pmc/articles/PMC3366882/ /pubmed/22559291 http://dx.doi.org/10.1186/1471-2164-13-170 Text en Copyright ©2012 Bayjanov et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title PhenoLink - a web-tool for linking phenotype to ~omics data for bacteria: application to gene-trait matching for Lactobacillus plantarum strains
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3366882/
https://www.ncbi.nlm.nih.gov/pubmed/22559291
http://dx.doi.org/10.1186/1471-2164-13-170