Matching experiments across species using expression values and textual information
Main authors: | Wise, Aaron; Oltvai, Zoltán N.; Bar-Joseph, Ziv |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2012 |
Subjects: | ISMB 2012 Proceedings Papers Committee July 15 to July 19, 2012, Long Beach, CA, USA |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3371837/ https://www.ncbi.nlm.nih.gov/pubmed/22689770 http://dx.doi.org/10.1093/bioinformatics/bts205 |
_version_ | 1782235267477798912 |
---|---|
author | Wise, Aaron; Oltvai, Zoltán N.; Bar-Joseph, Ziv |
author_facet | Wise, Aaron; Oltvai, Zoltán N.; Bar-Joseph, Ziv |
author_sort | Wise, Aaron |
collection | PubMed |
description | Motivation: With the vast increase in the number of gene expression datasets deposited in public databases, novel techniques are required to analyze and mine this wealth of data. Similar to the way BLAST enables cross-species comparison of sequence data, tools that enable cross-species expression comparison will allow us to better utilize these datasets: cross-species expression comparison enables us to address questions in evolution and development, and further allows the identification of disease-related genes and pathways that play similar roles in humans and model organisms. Unlike sequence, which is static, expression data changes over time and under different conditions. Thus, a prerequisite for performing cross-species analysis is the ability to match experiments across species. Results: To enable better cross-species comparisons, we developed methods for automatically identifying pairs of similar expression datasets across species. Our method uses a co-training algorithm to combine a model of expression similarity with a model of the text that accompanies the expression experiments. The co-training method outperforms previous methods based on expression similarity alone. Using expert analysis, we show that the new matches identified by our method indeed capture biological similarities across species. We then use the matched expression pairs between human and mouse to recover known and novel cycling genes as well as to identify genes with possible involvement in diabetes. By providing the ability to identify novel candidate genes in model organisms, our method opens the door to new models for studying diseases. Availability: Source code and supplementary information are available at: www.andrew.cmu.edu/user/aaronwis/cotrain12. Contact: zivbj@cs.cmu.edu Supplementary information: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-3371837 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-3371837 2012-06-11 Matching experiments across species using expression values and textual information Wise, Aaron; Oltvai, Zoltán N.; Bar-Joseph, Ziv Bioinformatics ISMB 2012 Proceedings Papers Committee July 15 to July 19, 2012, Long Beach, CA, USA Oxford University Press 2012-06-15 2012-06-09 /pmc/articles/PMC3371837/ /pubmed/22689770 http://dx.doi.org/10.1093/bioinformatics/bts205 Text en © The Author(s) 2012. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | ISMB 2012 Proceedings Papers Committee July 15 to July 19, 2012, Long Beach, CA, USA Wise, Aaron; Oltvai, Zoltán N.; Bar-Joseph, Ziv Matching experiments across species using expression values and textual information |
title | Matching experiments across species using expression values and textual information |
title_full | Matching experiments across species using expression values and textual information |
title_fullStr | Matching experiments across species using expression values and textual information |
title_full_unstemmed | Matching experiments across species using expression values and textual information |
title_short | Matching experiments across species using expression values and textual information |
title_sort | matching experiments across species using expression values and textual information |
topic | ISMB 2012 Proceedings Papers Committee July 15 to July 19, 2012, Long Beach, CA, USA |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3371837/ https://www.ncbi.nlm.nih.gov/pubmed/22689770 http://dx.doi.org/10.1093/bioinformatics/bts205 |
work_keys_str_mv | AT wiseaaron matchingexperimentsacrossspeciesusingexpressionvaluesandtextualinformation AT oltvaizoltann matchingexperimentsacrossspeciesusingexpressionvaluesandtextualinformation AT barjosephziv matchingexperimentsacrossspeciesusingexpressionvaluesandtextualinformation |
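
The abstract describes a co-training algorithm that combines an expression-similarity model with a model of the accompanying text. The sketch below illustrates the general co-training idea only and is not the authors' implementation: the nearest-centroid classifier, the function names (`fit_centroids`, `predict`, `co_train`) and the toy scores are all assumptions made for illustration. Each candidate human/mouse dataset pair is assumed to carry one expression-similarity score and one text-similarity score; in every round each view labels the unlabeled pair it is most confident about, and that label then becomes training data for both views.

```python
from statistics import mean


def fit_centroids(values, labels):
    """Return (mean score of non-matches, mean score of matches) for one view."""
    neg = mean([v for v, y in zip(values, labels) if y == 0])
    pos = mean([v for v, y in zip(values, labels) if y == 1])
    return neg, pos


def predict(centroids, value):
    """Nearest-centroid rule: return (label, confidence) for one score."""
    neg, pos = centroids
    d_neg, d_pos = abs(value - neg), abs(value - pos)
    label = 1 if d_pos < d_neg else 0
    return label, abs(d_neg - d_pos)


def co_train(expr, text, seeds, rounds=10):
    """Expand seed labels (index -> 0/1) using two views of the same pairs.

    The seeds must contain at least one match and one non-match.
    """
    labels = dict(seeds)
    for _ in range(rounds):
        unlabeled = [i for i in range(len(expr)) if i not in labels]
        if not unlabeled:
            break
        idx = sorted(labels)
        ys = [labels[i] for i in idx]
        new = {}
        for view in (expr, text):
            cents = fit_centroids([view[i] for i in idx], ys)
            # Each view picks the unlabeled pair it is most confident about.
            best = max(unlabeled, key=lambda i: predict(cents, view[i])[1])
            # If both views pick the same pair, the first view's label is kept.
            new.setdefault(best, predict(cents, view[best])[0])
        labels.update(new)
    return labels


# Hypothetical scores for six candidate dataset pairs (made up for illustration).
expr = [0.9, 0.1, 0.85, 0.5, 0.2, 0.55]   # expression-similarity view
text = [0.9, 0.1, 0.5, 0.8, 0.45, 0.15]   # text-similarity view
seeds = {0: 1, 1: 0}                       # one known match, one known non-match

print(co_train(expr, text, seeds))
```

In this toy data, pairs 3 and 5 have ambiguous expression scores but clear text scores, so the text view is the one that labels them. That is the kind of complementary signal co-training is meant to exploit.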