Graphical Causal Models and Imputing Missing Data: A Preliminary Study

Real-world datasets often contain many missing values for a variety of reasons. This is usually an issue since many learning algorithms require complete datasets. In certain cases, there are constraints in the real-world problem that create difficulties in continuously observing all data. In this pape...

Bibliographic Details
Main authors: Almeida, Rui Jorge; Adriaans, Greetje; Shapovalova, Yuliya
Format: Online Article Text
Language: English
Published: 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7274349/
http://dx.doi.org/10.1007/978-3-030-50146-4_36
author Almeida, Rui Jorge
Adriaans, Greetje
Shapovalova, Yuliya
collection PubMed
description Real-world datasets often contain many missing values for a variety of reasons. This is usually an issue since many learning algorithms require complete datasets. In certain cases, there are constraints in the real-world problem that create difficulties in continuously observing all data. In this paper, we investigate whether graphical causal models can be used to impute missing values and derive additional information on the uncertainty of the imputed values. Our goal is to use the information from a complete dataset in the form of graphical causal models to impute missing values in an incomplete dataset. This assumes that the datasets have the same data generating process. Furthermore, we calculate the probability of each missing data value belonging to a specified percentile. We present a preliminary study on the proposed method using synthetic data, where we can control the causal relations and missing values.
format Online
Article
Text
id pubmed-7274349
institution National Center for Biotechnology Information
language English
publishDate 2020
record_format MEDLINE/PubMed
spelling pubmed-7274349 2020-06-05 Graphical Causal Models and Imputing Missing Data: A Preliminary Study Almeida, Rui Jorge Adriaans, Greetje Shapovalova, Yuliya Information Processing and Management of Uncertainty in Knowledge-Based Systems Article Real-world datasets often contain many missing values due to several reasons. This is usually an issue since many learning algorithms require complete datasets. In certain cases, there are constraints in the real world problem that create difficulties in continuously observing all data. In this paper, we investigate if graphical causal models can be used to impute missing values and derive additional information on the uncertainty of the imputed values. Our goal is to use the information from a complete dataset in the form of graphical causal models to impute missing values in an incomplete dataset. This assumes that the datasets have the same data generating process. Furthermore, we calculate the probability of each missing data value belonging to a specified percentile. We present a preliminary study on the proposed method using synthetic data, where we can control the causal relations and missing values. 2020-05-18 /pmc/articles/PMC7274349/ http://dx.doi.org/10.1007/978-3-030-50146-4_36 Text en © Springer Nature Switzerland AG 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title Graphical Causal Models and Imputing Missing Data: A Preliminary Study
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7274349/
http://dx.doi.org/10.1007/978-3-030-50146-4_36