
Causality on longitudinal data: Stable specification search in constrained structural equation modeling

A typical problem in causal modeling is the instability of model structure learning, i.e., small changes in finite data can result in completely different optimal models. The present work introduces a novel causal modeling algorithm for longitudinal data that is robust for finite samples, based on r...


Bibliographic Details
Main authors: Rahmadi, Ridho; Groot, Perry; van Rijn, Marieke HC; van den Brand, Jan AJG; Heins, Marianne; Knoop, Hans; Heskes, Tom
Format: Online Article Text
Language: English
Published: SAGE Publications 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6249641/
https://www.ncbi.nlm.nih.gov/pubmed/28657454
http://dx.doi.org/10.1177/0962280217713347
_version_ 1783372786229575680
author Rahmadi, Ridho
Groot, Perry
van Rijn, Marieke HC
van den Brand, Jan AJG
Heins, Marianne
Knoop, Hans
Heskes, Tom
author_sort Rahmadi, Ridho
collection PubMed
description A typical problem in causal modeling is the instability of model structure learning, i.e., small changes in finite data can result in completely different optimal models. The present work introduces a novel causal modeling algorithm for longitudinal data that is robust for finite samples, based on recent advances in stability selection using subsampling and selection algorithms. Our approach uses exploratory search but allows incorporation of prior knowledge, e.g., the absence of a particular causal relationship between two specific variables. We represent causal relationships using structural equation models. Models are scored along two objectives: the model fit and the model complexity. Since these objectives often conflict, we apply a multi-objective evolutionary algorithm to search for Pareto optimal models. To handle the instability of small finite data samples, we repeatedly subsample the data and select those substructures (from the optimal models) that are both stable and parsimonious. These substructures can be visualized through a causal graph. On a simulated data set with a known ground truth, our more exploratory approach performs at least comparably to, and often significantly better than, state-of-the-art alternative approaches. We also present the results of our method on three real-world longitudinal data sets on chronic fatigue syndrome, Alzheimer disease, and chronic kidney disease. The findings obtained with our approach are generally in line with results from more hypothesis-driven analyses in earlier studies and suggest some novel relationships that deserve further research.
format Online
Article
Text
id pubmed-6249641
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher SAGE Publications
record_format MEDLINE/PubMed
spelling pubmed-6249641 2018-12-17 Causality on longitudinal data: Stable specification search in constrained structural equation modeling Rahmadi, Ridho; Groot, Perry; van Rijn, Marieke HC; van den Brand, Jan AJG; Heins, Marianne; Knoop, Hans; Heskes, Tom Stat Methods Med Res Articles
SAGE Publications 2017-06-28 2018-12 /pmc/articles/PMC6249641/ /pubmed/28657454 http://dx.doi.org/10.1177/0962280217713347 Text en © The Author(s) 2017 http://creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (http://www.creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).
title Causality on longitudinal data: Stable specification search in constrained structural equation modeling
topic Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6249641/
https://www.ncbi.nlm.nih.gov/pubmed/28657454
http://dx.doi.org/10.1177/0962280217713347
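
The two ideas named in the description field above, keeping only Pareto optimal models on (fit, complexity) and retaining substructures that recur across repeated subsamples, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `pareto_front`, `stable_edges`, and `search`, the dictionary keys `misfit` and `complexity`, and the defaults for `n_subsamples`, `fraction`, and `threshold` are all hypothetical, and the multi-objective evolutionary search itself is replaced by a caller-supplied function.

```python
import random
from collections import Counter


def pareto_front(models):
    """Return the models that no other model dominates.

    Each model is a dict with hypothetical keys "misfit" and "complexity";
    lower is better for both objectives.
    """
    front = []
    for m in models:
        dominated = any(
            o["misfit"] <= m["misfit"]
            and o["complexity"] <= m["complexity"]
            and (o["misfit"] < m["misfit"] or o["complexity"] < m["complexity"])
            for o in models
        )
        if not dominated:
            front.append(m)
    return front


def stable_edges(data, search, n_subsamples=50, fraction=0.5,
                 threshold=0.6, seed=0):
    """Keep edges whose selection frequency across subsamples meets a threshold.

    `search` is a placeholder for any structure search: it takes a list of
    records and returns an iterable of (cause, effect) edges. All default
    values are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    size = max(1, int(len(data) * fraction))
    counts = Counter()
    for _ in range(n_subsamples):
        subsample = rng.sample(data, size)   # draw without replacement
        counts.update(set(search(subsample)))  # count each edge once per run
    return {
        edge: count / n_subsamples
        for edge, count in counts.items()
        if count / n_subsamples >= threshold
    }
```

In this sketch an edge that the search returns in every subsample gets frequency 1.0, while an edge that appears only occasionally falls below the threshold and is dropped; a parsimony criterion, as mentioned in the abstract, would be an additional filter on top of this.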