New methods for separating causes from effects in genomics data

BACKGROUND: The discovery of molecular pathways is a challenging problem, and its solution relies on the identification of causal molecular interactions in genomics data. Causal molecular interactions can be discovered using randomized experiments; however, such experiments are often costly, infeasibl...

Bibliographic Details
Main authors: Statnikov, Alexander; Henaff, Mikael; Lytkin, Nikita I; Aliferis, Constantin F
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3535696/
https://www.ncbi.nlm.nih.gov/pubmed/23282373
http://dx.doi.org/10.1186/1471-2164-13-S8-S22
author Statnikov, Alexander
Henaff, Mikael
Lytkin, Nikita I
Aliferis, Constantin F
collection PubMed
description BACKGROUND: The discovery of molecular pathways is a challenging problem, and its solution relies on the identification of causal molecular interactions in genomics data. Causal molecular interactions can be discovered using randomized experiments; however, such experiments are often costly, infeasible, or unethical. Fortunately, algorithms that infer causal interactions from observational data have been in development for decades, predominantly in the quantitative sciences, and many of them have recently been applied to genomics data. While these algorithms can infer unoriented causal interactions between the molecular variables involved (i.e., without specifying which one is the cause and which one is the effect), causally orienting all inferred molecular interactions was assumed to be an unsolvable problem until recently. In this work, we use transcription factor-target gene regulatory interactions in three different organisms to evaluate a new family of methods that, given observational data for just two causally related variables, can determine which one is the cause and which one is the effect.
RESULTS: We have found that a particular family of causal orientation methods (IGCI Gaussian) is often able to accurately infer the directionality of causal interactions, and that these methods usually outperform other causal orientation techniques. We also introduced a novel ensemble technique for causal orientation that combines the decisions of individual causal orientation methods. The ensemble method was found to be more accurate than the best individual causal orientation method in the tested data.
CONCLUSIONS: This work represents a first step towards establishing context for practical use of causal orientation methods in the genomics domain. We have found that some causal orientation methodologies yield accurate predictions of causal orientation in genomics data, and we have improved on this capability with a novel ensemble method. Our results suggest that these methods have the potential to facilitate reconstruction of molecular pathways by minimizing the number of randomized experiments required to find causal directionality and by avoiding experiments that are infeasible and/or unethical.
format Online
Article
Text
id pubmed-3535696
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3535696 2013-01-04 New methods for separating causes from effects in genomics data. Statnikov, Alexander; Henaff, Mikael; Lytkin, Nikita I; Aliferis, Constantin F. BMC Genomics. Research. BioMed Central 2012-12-17 /pmc/articles/PMC3535696/ /pubmed/23282373 http://dx.doi.org/10.1186/1471-2164-13-S8-S22 Text en Copyright ©2012 Statnikov et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title New methods for separating causes from effects in genomics data
topic Research