Causality on longitudinal data: Stable specification search in constrained structural equation modeling
A typical problem in causal modeling is the instability of model structure learning, i.e., small changes in finite data can result in completely different optimal models. The present work introduces a novel causal modeling algorithm for longitudinal data that is robust for finite samples based on r...
Main Authors: | Rahmadi, Ridho; Groot, Perry; van Rijn, Marieke HC; van den Brand, Jan AJG; Heins, Marianne; Knoop, Hans; Heskes, Tom |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | SAGE Publications 2017 |
Subjects: | Articles |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6249641/ https://www.ncbi.nlm.nih.gov/pubmed/28657454 http://dx.doi.org/10.1177/0962280217713347 |
_version_ | 1783372786229575680 |
---|---|
author | Rahmadi, Ridho Groot, Perry van Rijn, Marieke HC van den Brand, Jan AJG Heins, Marianne Knoop, Hans Heskes, Tom |
author_facet | Rahmadi, Ridho Groot, Perry van Rijn, Marieke HC van den Brand, Jan AJG Heins, Marianne Knoop, Hans Heskes, Tom |
author_sort | Rahmadi, Ridho |
collection | PubMed |
description | A typical problem in causal modeling is the instability of model structure learning, i.e., small changes in finite data can result in completely different optimal models. The present work introduces a novel causal modeling algorithm for longitudinal data that is robust for finite samples, based on recent advances in stability selection using subsampling and selection algorithms. Our approach uses exploratory search but allows incorporation of prior knowledge, e.g., the absence of a particular causal relationship between two specific variables. We represent causal relationships using structural equation models. Models are scored along two objectives: the model fit and the model complexity. Since the two objectives often conflict, we apply a multi-objective evolutionary algorithm to search for Pareto optimal models. To handle the instability of small finite data samples, we repeatedly subsample the data and select those substructures (from the optimal models) that are both stable and parsimonious. These substructures can be visualized through a causal graph. Our more exploratory approach performs at least comparably to, and often significantly better than, state-of-the-art alternative approaches on a simulated data set with a known ground truth. We also present the results of our method on three real-world longitudinal data sets on chronic fatigue syndrome, Alzheimer disease, and chronic kidney disease. The findings obtained with our approach are generally in line with results from more hypothesis-driven analyses in earlier studies and suggest some novel relationships that deserve further research. |
format | Online Article Text |
id | pubmed-6249641 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | SAGE Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-62496412018-12-17 Causality on longitudinal data: Stable specification search in constrained structural equation modeling Rahmadi, Ridho Groot, Perry van Rijn, Marieke HC van den Brand, Jan AJG Heins, Marianne Knoop, Hans Heskes, Tom Stat Methods Med Res Articles A typical problem in causal modeling is the instability of model structure learning, i.e., small changes in finite data can result in completely different optimal models. The present work introduces a novel causal modeling algorithm for longitudinal data, that is robust for finite samples based on recent advances in stability selection using subsampling and selection algorithms. Our approach uses exploratory search but allows incorporation of prior knowledge, e.g., the absence of a particular causal relationship between two specific variables. We represent causal relationships using structural equation models. Models are scored along two objectives: the model fit and the model complexity. Since both objectives are often conflicting, we apply a multi-objective evolutionary algorithm to search for Pareto optimal models. To handle the instability of small finite data samples, we repeatedly subsample the data and select those substructures (from the optimal models) that are both stable and parsimonious. These substructures can be visualized through a causal graph. Our more exploratory approach achieves at least comparable performance as, but often a significant improvement over state-of-the-art alternative approaches on a simulated data set with a known ground truth. We also present the results of our method on three real-world longitudinal data sets on chronic fatigue syndrome, Alzheimer disease, and chronic kidney disease. The findings obtained with our approach are generally in line with results from more hypothesis-driven analyses in earlier studies and suggest some novel relationships that deserve further research. 
SAGE Publications 2017-06-28 2018-12 /pmc/articles/PMC6249641/ /pubmed/28657454 http://dx.doi.org/10.1177/0962280217713347 Text en © The Author(s) 2017 http://creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (http://www.creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage). |
spellingShingle | Articles Rahmadi, Ridho Groot, Perry van Rijn, Marieke HC van den Brand, Jan AJG Heins, Marianne Knoop, Hans Heskes, Tom Causality on longitudinal data: Stable specification search in constrained structural equation modeling |
title | Causality on longitudinal data: Stable specification search in constrained structural equation modeling |
title_full | Causality on longitudinal data: Stable specification search in constrained structural equation modeling |
title_fullStr | Causality on longitudinal data: Stable specification search in constrained structural equation modeling |
title_full_unstemmed | Causality on longitudinal data: Stable specification search in constrained structural equation modeling |
title_short | Causality on longitudinal data: Stable specification search in constrained structural equation modeling |
title_sort | causality on longitudinal data: stable specification search in constrained structural equation modeling |
topic | Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6249641/ https://www.ncbi.nlm.nih.gov/pubmed/28657454 http://dx.doi.org/10.1177/0962280217713347 |
work_keys_str_mv | AT rahmadiridho causalityonlongitudinaldatastablespecificationsearchinconstrainedstructuralequationmodeling AT grootperry causalityonlongitudinaldatastablespecificationsearchinconstrainedstructuralequationmodeling AT vanrijnmariekehc causalityonlongitudinaldatastablespecificationsearchinconstrainedstructuralequationmodeling AT vandenbrandjanajg causalityonlongitudinaldatastablespecificationsearchinconstrainedstructuralequationmodeling AT heinsmarianne causalityonlongitudinaldatastablespecificationsearchinconstrainedstructuralequationmodeling AT knoophans causalityonlongitudinaldatastablespecificationsearchinconstrainedstructuralequationmodeling AT heskestom causalityonlongitudinaldatastablespecificationsearchinconstrainedstructuralequationmodeling AT causalityonlongitudinaldatastablespecificationsearchinconstrainedstructuralequationmodeling AT causalityonlongitudinaldatastablespecificationsearchinconstrainedstructuralequationmodeling AT causalityonlongitudinaldatastablespecificationsearchinconstrainedstructuralequationmodeling |
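
The description above outlines two ideas: keeping only Pareto optimal models when fit and complexity conflict, and repeatedly subsampling the data to find substructures that recur. The following Python sketch illustrates both under stated assumptions and is not the authors' implementation. The dictionary keys `misfit`, `complexity`, and `edges`, the function names, and the `fit_models` callable (a stand-in for the multi-objective evolutionary search, which is not reproduced here) are all hypothetical. Drawing half of the records per subsample and pooling edges across the whole front are simplifying assumptions.

```python
import random
from collections import Counter


def pareto_front(models):
    """Return the models that no other model dominates.

    Each model is a dict with hypothetical keys "misfit" and "complexity";
    lower is better on both. A model is dominated when another model is at
    least as good on both objectives and strictly better on one.
    """
    front = []
    for m in models:
        dominated = any(
            o["misfit"] <= m["misfit"]
            and o["complexity"] <= m["complexity"]
            and (o["misfit"] < m["misfit"] or o["complexity"] < m["complexity"])
            for o in models
        )
        if not dominated:
            front.append(m)
    return front


def edge_stability(data, fit_models, n_subsamples=100, seed=0):
    """Estimate how often each edge appears among Pareto optimal models.

    data: a list of records.
    fit_models: hypothetical callable mapping a subsample to candidate models,
        each carrying an "edges" collection of (cause, effect) pairs.
    Returns a dict mapping each observed edge to the fraction of subsamples
    in which at least one Pareto optimal model contained it.
    """
    rng = random.Random(seed)
    counts = Counter()
    half = len(data) // 2  # assumption: subsample half of the records
    for _ in range(n_subsamples):
        subsample = rng.sample(data, half)
        edges_seen = set()
        for model in pareto_front(fit_models(subsample)):
            edges_seen.update(model["edges"])
        counts.update(edges_seen)  # count each edge once per subsample
    return {edge: count / n_subsamples for edge, count in counts.items()}
```

Edges whose frequency exceeds a chosen threshold would then be candidates for the causal graph. The threshold and the way parsimony is taken into account are left open here because the record does not specify them.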