
A Deep Artificial Neural Network−Based Model for Prediction of Underlying Cause of Death From Death Certificates: Algorithm Development and Validation

BACKGROUND: Coding of underlying causes of death from death certificates is a process that is nowadays undertaken mostly by humans with potential assistance from expert systems, such as the Iris software. It is, consequently, an expensive process that can, in addition, suffer from geospatial discrep...


Bibliographic Details
Main authors: Falissard, Louis; Morgand, Claire; Roussel, Sylvie; Imbaud, Claire; Ghosn, Walid; Bounebache, Karim; Rey, Grégoire
Format: Online Article Text
Language: English
Published: JMIR Publications 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7218605/
https://www.ncbi.nlm.nih.gov/pubmed/32343252
http://dx.doi.org/10.2196/17125
author Falissard, Louis
Morgand, Claire
Roussel, Sylvie
Imbaud, Claire
Ghosn, Walid
Bounebache, Karim
Rey, Grégoire
author_sort Falissard, Louis
collection PubMed
description BACKGROUND: Coding of underlying causes of death from death certificates is a process that is nowadays undertaken mostly by humans with potential assistance from expert systems, such as the Iris software. It is, consequently, an expensive process that can, in addition, suffer from geospatial discrepancies, thus severely impairing the comparability of death statistics at the international level. The recent advances in artificial intelligence, specifically the rise of deep learning methods, have enabled computers to make efficient decisions on a number of complex problems that were typically considered out of reach without human assistance; these methods require a considerable amount of data to learn from, which is typically their main limiting factor. However, the CépiDc (Centre d’épidémiologie sur les causes médicales de Décès) stores an exhaustive database of death certificates at the French national scale, amounting to several million training examples available for the machine learning practitioner.
OBJECTIVE: This article investigates the application of deep neural network methods to coding underlying causes of death.
METHODS: The investigated dataset was based on data from every French death certificate from 2000 to 2015, containing information such as the subject’s age and gender, as well as the chain of events leading to his or her death, for a total of around 8 million observations. The task of automatically coding the subject’s underlying cause of death was then formulated as a predictive modelling problem. A deep neural network−based model was then designed and fit to the dataset. Its error rate was then assessed on an exterior test dataset and compared to the current state-of-the-art (ie, the Iris software). Statistical significance of the proposed approach’s superiority was assessed via bootstrap.
RESULTS: The proposed approach resulted in a test accuracy of 97.8% (95% CI 97.7-97.9), which constitutes a significant improvement over the current state-of-the-art and its accuracy of 74.5% (95% CI 74.0-75.0) assessed on the same test set. Such an improvement opens up a whole field of new applications, from nosologist-level batch-automated coding to international and temporal harmonization of cause of death statistics. A typical example of such an application is demonstrated by recoding French overdose-related deaths from 2000 to 2010.
CONCLUSIONS: This article shows that deep artificial neural networks are perfectly suited to the analysis of electronic health records and can learn a complex set of medical rules directly from voluminous datasets, without any explicit prior knowledge. Although not entirely free from mistakes, the derived algorithm constitutes a powerful decision-making tool that is able to handle structured medical data with unprecedented performance. We strongly believe that the methods developed in this article are highly reusable in a variety of settings related to epidemiology, biostatistics, and the medical sciences in general.
format Online
Article
Text
id pubmed-7218605
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-7218605 2020-05-18 A Deep Artificial Neural Network−Based Model for Prediction of Underlying Cause of Death From Death Certificates: Algorithm Development and Validation Falissard, Louis Morgand, Claire Roussel, Sylvie Imbaud, Claire Ghosn, Walid Bounebache, Karim Rey, Grégoire JMIR Med Inform Original Paper JMIR Publications 2020-04-28 /pmc/articles/PMC7218605/ /pubmed/32343252 http://dx.doi.org/10.2196/17125 Text en ©Louis Falissard, Claire Morgand, Sylvie Roussel, Claire Imbaud, Walid Ghosn, Karim Bounebache, Grégoire Rey. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 28.04.2020.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included.
title A Deep Artificial Neural Network−Based Model for Prediction of Underlying Cause of Death From Death Certificates: Algorithm Development and Validation
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7218605/
https://www.ncbi.nlm.nih.gov/pubmed/32343252
http://dx.doi.org/10.2196/17125