
Scalable Causal Structure Learning: Scoping Review of Traditional and Deep Learning Algorithms and New Opportunities in Biomedicine

BACKGROUND: Causal structure learning refers to a process of identifying causal structures from observational data, and it can have multiple applications in biomedicine and health care. OBJECTIVE: This paper provides a practical review and tutorial on scalable causal structure learning models with e...


Bibliographic Details
Main authors:	Upadhyaya, Pulakesh, Zhang, Kai, Li, Can, Jiang, Xiaoqian, Kim, Yejin
Format:	Online Article Text
Language:	English
Published:	JMIR Publications 2023
Subjects:	Review
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9890349/ https://www.ncbi.nlm.nih.gov/pubmed/36649070 http://dx.doi.org/10.2196/38266

_version_	1784880930294333440
author	Upadhyaya, Pulakesh Zhang, Kai Li, Can Jiang, Xiaoqian Kim, Yejin
author_sort	Upadhyaya, Pulakesh
collection	PubMed
description	BACKGROUND: Causal structure learning refers to a process of identifying causal structures from observational data, and it can have multiple applications in biomedicine and health care. OBJECTIVE: This paper provides a practical review and tutorial on scalable causal structure learning models with examples of real-world data to help health care audiences understand and apply them. METHODS: We reviewed traditional (combinatorial and score-based) methods for causal structure discovery and machine learning–based schemes. Various traditional approaches have been studied to tackle this problem, the most important among these being the Peter Spirtes and Clark Glymour algorithms. This was followed by analyzing the literature on score-based methods, which are computationally faster. Owing to the continuous constraint on acyclicity, there are new deep learning approaches to the problem in addition to traditional and score-based methods. Such methods can also offer scalability, particularly when there is a large amount of data involving multiple variables. Using our own evaluation metrics and experiments on linear, nonlinear, and benchmark Sachs data, we aimed to highlight the various advantages and disadvantages associated with these methods for the health care community. We also highlighted recent developments in biomedicine where causal structure learning can be applied to discover structures such as gene networks, brain connectivity networks, and those in cancer epidemiology. RESULTS: We also compared the performance of traditional and machine learning–based algorithms for causal discovery over some benchmark data sets. Directed Acyclic Graph-Graph Neural Network has the lowest structural Hamming distance (19) and false positive rate (0.13) based on the Sachs data set, whereas Greedy Equivalence Search and Max-Min Hill Climbing have the best false discovery rate (0.68) and true positive rate (0.56), respectively.
CONCLUSIONS: Machine learning–based approaches, including deep learning, have many advantages over traditional approaches, such as scalability, including a greater number of variables, and potentially being applied in a wide range of biomedical applications, such as genetics, if sufficient data are available. Furthermore, these models are more flexible than traditional models and are poised to positively affect many applications in the future.
format	Online Article Text
id	pubmed-9890349
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	JMIR Publications
record_format	MEDLINE/PubMed
spelling	pubmed-9890349 2023-02-02 Scalable Causal Structure Learning: Scoping Review of Traditional and Deep Learning Algorithms and New Opportunities in Biomedicine Upadhyaya, Pulakesh Zhang, Kai Li, Can Jiang, Xiaoqian Kim, Yejin JMIR Med Inform Review JMIR Publications 2023-01-17 /pmc/articles/PMC9890349/ /pubmed/36649070 http://dx.doi.org/10.2196/38266 Text en ©Pulakesh Upadhyaya, Kai Zhang, Can Li, Xiaoqian Jiang, Yejin Kim. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 17.01.2023. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on https://medinform.jmir.org/, as well as this copyright and license information must be included.
title	Scalable Causal Structure Learning: Scoping Review of Traditional and Deep Learning Algorithms and New Opportunities in Biomedicine
topic	Review
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9890349/ https://www.ncbi.nlm.nih.gov/pubmed/36649070 http://dx.doi.org/10.2196/38266
