Bivariate Causal Discovery and Its Applications to Gene Expression and Imaging Data Analysis

Bibliographic Details
Main authors: Jiao, Rong; Lin, Nan; Hu, Zixin; Bennett, David A.; Jin, Li; Xiong, Momiao
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2018
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6127271/
https://www.ncbi.nlm.nih.gov/pubmed/30233639
http://dx.doi.org/10.3389/fgene.2018.00347
collection PubMed
description Mainstream research in genetics, epigenetics, and imaging data analysis focuses on statistical association, that is, on exploring statistical dependence between variables. Despite significant progress in genetic research, the etiology and mechanisms of complex phenotypes remain elusive. Reliance on association analysis as the major analytical platform for complex data analysis is a key issue that hampers the theoretical development of genomic science and its application in practice. Causal inference is an essential component for the discovery of mechanistic relationships among complex phenotypes, and many researchers suggest making the transition from association to causation. Despite the fundamental role of causal inference in science, engineering, and biomedicine, the traditional methods for causal inference require at least three variables. However, quantitative genetic analyses such as QTL, eQTL, mQTL, and genomic-imaging data analysis require exploring the causal relationship between two variables. This paper will focus on bivariate causal discovery with continuous variables. We will introduce independence of cause and mechanism (ICM) as a basic principle for causal inference, and algorithmic information theory and the additive noise model (ANM) as major tools for bivariate causal discovery. Large-scale simulations will be performed to evaluate the feasibility of the ANM for bivariate causal discovery. To further evaluate its performance for causal inference, the ANM will be applied to the construction of gene regulatory networks. The ANM will also be applied to trait-imaging data analysis to illustrate three scenarios: presence of both causation and association, presence of association in the absence of causation, and presence of causation without association between two variables. Telling cause from effect between two continuous variables from observational data is one of the fundamental and challenging problems in omics and imaging data analysis. Our preliminary simulations and real data analysis will show that ANMs can be one of the methods of choice for bivariate causal discovery in genomic and imaging data analysis.
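The general idea behind an additive noise model can be illustrated with a minimal sketch. It is not the authors' implementation: the cubic polynomial regression, the Gaussian-kernel HSIC dependence score with a median-heuristic bandwidth, the synthetic data, and all function names (gaussian_gram, hsic, anm_score, infer_direction) are assumptions made for illustration. The sketch regresses each variable on the other and prefers the direction in which the residuals look less dependent on the putative cause.

```python
import numpy as np


def gaussian_gram(v):
    """Gaussian kernel Gram matrix; bandwidth set by the median heuristic (an assumption)."""
    d2 = (v[:, None] - v[None, :]) ** 2
    sigma2 = np.median(d2[d2 > 0])
    return np.exp(-d2 / (2.0 * sigma2))


def hsic(a, b):
    """Biased empirical HSIC statistic; larger values suggest stronger dependence."""
    n = len(a)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    K = gaussian_gram(a)
    L = gaussian_gram(b)
    return np.trace(K @ H @ L @ H) / n**2


def anm_score(cause, effect, degree=3):
    """Fit effect = f(cause) + noise with a polynomial f, then score residual dependence."""
    coeffs = np.polyfit(cause, effect, degree)
    residuals = effect - np.polyval(coeffs, cause)
    return hsic(cause, residuals)


def infer_direction(x, y):
    """Prefer the direction whose residuals look more independent of the putative cause."""
    s_xy = anm_score(x, y)
    s_yx = anm_score(y, x)
    direction = "x->y" if s_xy < s_yx else "y->x"
    return direction, s_xy, s_yx


# Synthetic example (hypothetical data): x causes y through a cubic map plus uniform noise.
rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, 300)
y = x**3 + rng.uniform(-1.0, 1.0, 300)

direction, s_xy, s_yx = infer_direction(x, y)
print(direction, s_xy, s_yx)
```

Comparing two raw HSIC scores is a simplification; a fuller analysis would typically turn each score into a significance test (for example by permutation) and allow an undecided outcome when both directions fit equally well or equally badly.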
format Online
Article
Text
id pubmed-6127271
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-6127271 2018-09-19 Front Genet Genetics Frontiers Media S.A. 2018-08-31 /pmc/articles/PMC6127271/ /pubmed/30233639 http://dx.doi.org/10.3389/fgene.2018.00347 Text en Copyright © 2018 Jiao, Lin, Hu, Bennett, Jin and Xiong. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Bivariate Causal Discovery and Its Applications to Gene Expression and Imaging Data Analysis
topic Genetics