Graphical Causal Models and Imputing Missing Data: A Preliminary Study
Real-world datasets often contain many missing values due to several reasons. This is usually an issue since many learning algorithms require complete datasets. In certain cases, there are constraints in the real world problem that create difficulties in continuously observing all data. In this pape...
Main Authors: | Almeida, Rui Jorge; Adriaans, Greetje; Shapovalova, Yuliya |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7274349/ http://dx.doi.org/10.1007/978-3-030-50146-4_36 |
_version_ | 1783542562435366912 |
---|---|
author | Almeida, Rui Jorge Adriaans, Greetje Shapovalova, Yuliya |
author_facet | Almeida, Rui Jorge Adriaans, Greetje Shapovalova, Yuliya |
author_sort | Almeida, Rui Jorge |
collection | PubMed |
description | Real-world datasets often contain many missing values due to several reasons. This is usually an issue since many learning algorithms require complete datasets. In certain cases, there are constraints in the real world problem that create difficulties in continuously observing all data. In this paper, we investigate if graphical causal models can be used to impute missing values and derive additional information on the uncertainty of the imputed values. Our goal is to use the information from a complete dataset in the form of graphical causal models to impute missing values in an incomplete dataset. This assumes that the datasets have the same data generating process. Furthermore, we calculate the probability of each missing data value belonging to a specified percentile. We present a preliminary study on the proposed method using synthetic data, where we can control the causal relations and missing values. |
format | Online Article Text |
id | pubmed-7274349 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
record_format | MEDLINE/PubMed |
spelling | pubmed-72743492020-06-05 Graphical Causal Models and Imputing Missing Data: A Preliminary Study Almeida, Rui Jorge Adriaans, Greetje Shapovalova, Yuliya Information Processing and Management of Uncertainty in Knowledge-Based Systems Article Real-world datasets often contain many missing values due to several reasons. This is usually an issue since many learning algorithms require complete datasets. In certain cases, there are constraints in the real world problem that create difficulties in continuously observing all data. In this paper, we investigate if graphical causal models can be used to impute missing values and derive additional information on the uncertainty of the imputed values. Our goal is to use the information from a complete dataset in the form of graphical causal models to impute missing values in an incomplete dataset. This assumes that the datasets have the same data generating process. Furthermore, we calculate the probability of each missing data value belonging to a specified percentile. We present a preliminary study on the proposed method using synthetic data, where we can control the causal relations and missing values. 2020-05-18 /pmc/articles/PMC7274349/ http://dx.doi.org/10.1007/978-3-030-50146-4_36 Text en © Springer Nature Switzerland AG 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
spellingShingle | Article Almeida, Rui Jorge Adriaans, Greetje Shapovalova, Yuliya Graphical Causal Models and Imputing Missing Data: A Preliminary Study |
title | Graphical Causal Models and Imputing Missing Data: A Preliminary Study |
title_full | Graphical Causal Models and Imputing Missing Data: A Preliminary Study |
title_fullStr | Graphical Causal Models and Imputing Missing Data: A Preliminary Study |
title_full_unstemmed | Graphical Causal Models and Imputing Missing Data: A Preliminary Study |
title_short | Graphical Causal Models and Imputing Missing Data: A Preliminary Study |
title_sort | graphical causal models and imputing missing data: a preliminary study |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7274349/ http://dx.doi.org/10.1007/978-3-030-50146-4_36 |
work_keys_str_mv | AT almeidaruijorge graphicalcausalmodelsandimputingmissingdataapreliminarystudy AT adriaansgreetje graphicalcausalmodelsandimputingmissingdataapreliminarystudy AT shapovalovayuliya graphicalcausalmodelsandimputingmissingdataapreliminarystudy |