Data-driven information retrieval in heterogeneous collections of transcriptomics data links SIM2s to malignant pleural mesothelioma

Motivation: Genome-wide measurement of transcript levels is a ubiquitous tool in biomedical research. As experimental data continues to be deposited in public databases, it is becoming important to develop search engines that enable the retrieval of relevant studies given a query study. While retri...

Bibliographic Details
Main authors: Caldas, José, Gehlenborg, Nils, Kettunen, Eeva, Faisal, Ali, Rönty, Mikko, Nicholson, Andrew G., Knuutila, Sakari, Brazma, Alvis, Kaski, Samuel
Format: Online Article Text
Language: English
Published: Oxford University Press 2012
Subjects: Original Papers
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3259436/
https://www.ncbi.nlm.nih.gov/pubmed/22106335
http://dx.doi.org/10.1093/bioinformatics/btr634
author Caldas, José
Gehlenborg, Nils
Kettunen, Eeva
Faisal, Ali
Rönty, Mikko
Nicholson, Andrew G.
Knuutila, Sakari
Brazma, Alvis
Kaski, Samuel
collection PubMed
description Motivation: Genome-wide measurement of transcript levels is a ubiquitous tool in biomedical research. As experimental data continues to be deposited in public databases, it is becoming important to develop search engines that enable the retrieval of relevant studies given a query study. While retrieval systems based on meta-data already exist, data-driven approaches that retrieve studies based on similarities in the expression data itself have a greater potential for uncovering novel biological insights. Results: We propose an information retrieval method based on differential expression. Our method deals with arbitrary experimental designs and performs competitively with alternative approaches, while making the search results interpretable in terms of differential expression patterns. We show that our model yields meaningful connections between biological conditions from different studies. Finally, we validate a previously unknown connection between malignant pleural mesothelioma and SIM2s suggested by our method, via real-time polymerase chain reaction in an independent set of mesothelioma samples. Availability: Supplementary data and source code are available from http://www.ebi.ac.uk/fg/research/rex. Contact: samuel.kaski@aalto.fi Supplementary Information: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-3259436
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3259436 2012-01-17 Bioinformatics Original Papers Oxford University Press 2012-01-15 2011-11-20 /pmc/articles/PMC3259436/ /pubmed/22106335 http://dx.doi.org/10.1093/bioinformatics/btr634 Text en © The Author(s) 2011. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Data-driven information retrieval in heterogeneous collections of transcriptomics data links SIM2s to malignant pleural mesothelioma
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3259436/
https://www.ncbi.nlm.nih.gov/pubmed/22106335
http://dx.doi.org/10.1093/bioinformatics/btr634