Matching experiments across species using expression values and textual information

Bibliographic Details
Main Authors: Wise, Aaron, Oltvai, Zoltán N., Bar-Joseph, Ziv
Format: Online Article Text
Language: English
Published: Oxford University Press 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3371837/
https://www.ncbi.nlm.nih.gov/pubmed/22689770
http://dx.doi.org/10.1093/bioinformatics/bts205
author Wise, Aaron
Oltvai, Zoltán N.
Bar-Joseph, Ziv
collection PubMed
description Motivation: With the vast increase in the number of gene expression datasets deposited in public databases, novel techniques are required to analyze and mine this wealth of data. Similar to the way BLAST enables cross-species comparison of sequence data, tools that enable cross-species expression comparison will allow us to better utilize these datasets: cross-species expression comparison enables us to address questions in evolution and development, and further allows the identification of disease-related genes and pathways that play similar roles in humans and model organisms. Unlike sequence, which is static, expression data changes over time and under different conditions. Thus, a prerequisite for performing cross-species analysis is the ability to match experiments across species. Results: To enable better cross-species comparisons, we developed methods for automatically identifying pairs of similar expression datasets across species. Our method uses a co-training algorithm to combine a model of expression similarity with a model of the text which accompanies the expression experiments. The co-training method outperforms previous methods based on expression similarity alone. Using expert analysis, we show that the new matches identified by our method indeed capture biological similarities across species. We then use the matched expression pairs between human and mouse to recover known and novel cycling genes as well as to identify genes with possible involvement in diabetes. By providing the ability to identify novel candidate genes in model organisms, our method opens the door to new models for studying diseases. Availability: Source code and supplementary information are available at: www.andrew.cmu.edu/user/aaronwis/cotrain12. Contact: zivbj@cs.cmu.edu Supplementary information: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-3371837
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3371837 2012-06-11 Bioinformatics ISMB 2012 Proceedings Papers Committee July 15 to July 19, 2012, Long Beach, CA, USA Oxford University Press 2012-06-15 2012-06-09 /pmc/articles/PMC3371837/ /pubmed/22689770 http://dx.doi.org/10.1093/bioinformatics/bts205 Text en © The Author(s) 2012. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Matching experiments across species using expression values and textual information
topic ISMB 2012 Proceedings Papers Committee July 15 to July 19, 2012, Long Beach, CA, USA
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3371837/
https://www.ncbi.nlm.nih.gov/pubmed/22689770
http://dx.doi.org/10.1093/bioinformatics/bts205