
Random forests for verbal autopsy analysis: multisite validation study using clinical diagnostic gold standards

BACKGROUND: Computer-coded verbal autopsy (CCVA) is a promising alternative to the standard approach of physician-certified verbal autopsy (PCVA), because of its high speed, low cost, and reliability. This study introduces a new CCVA technique and validates its performance using defined clinical dia...


Bibliographic Details
Main Authors: Flaxman, Abraham D; Vahdatpour, Alireza; Green, Sean; James, Spencer L; Murray, Christopher JL
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects: Research
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3160922/
https://www.ncbi.nlm.nih.gov/pubmed/21816105
http://dx.doi.org/10.1186/1478-7954-9-29
author Flaxman, Abraham D
Vahdatpour, Alireza
Green, Sean
James, Spencer L
Murray, Christopher JL
collection PubMed
description BACKGROUND: Computer-coded verbal autopsy (CCVA) is a promising alternative to the standard approach of physician-certified verbal autopsy (PCVA), because of its high speed, low cost, and reliability. This study introduces a new CCVA technique and validates its performance using defined clinical diagnostic criteria as a gold standard for a multisite sample of 12,542 verbal autopsies (VAs). METHODS: The Random Forest (RF) Method from machine learning (ML) was adapted to predict cause of death by training random forests to distinguish between each pair of causes, and then combining the results through a novel ranking technique. We assessed quality of the new method at the individual level using chance-corrected concordance and at the population level using cause-specific mortality fraction (CSMF) accuracy as well as linear regression. We also compared the quality of RF to PCVA for all of these metrics. We performed this analysis separately for adult, child, and neonatal VAs. We also assessed the variation in performance with and without household recall of health care experience (HCE). RESULTS: For all metrics, for all settings, RF was as good as or better than PCVA, with the exception of a nonsignificantly lower CSMF accuracy for neonates with HCE information. With HCE, the chance-corrected concordance of RF was 3.4 percentage points higher for adults, 3.2 percentage points higher for children, and 1.6 percentage points higher for neonates. The CSMF accuracy was 0.097 higher for adults, 0.097 higher for children, and 0.007 lower for neonates. Without HCE, the chance-corrected concordance of RF was 8.1 percentage points higher than PCVA for adults, 10.2 percentage points higher for children, and 5.9 percentage points higher for neonates. The CSMF accuracy was higher for RF by 0.102 for adults, 0.131 for children, and 0.025 for neonates. 
CONCLUSIONS: We found that our RF Method outperformed the PCVA method in terms of chance-corrected concordance and CSMF accuracy for adult and child VA with and without HCE and for neonatal VA without HCE. It is also preferable to PCVA in terms of time and cost. Therefore, we recommend it as the technique of choice for analyzing past and current verbal autopsies.
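The two headline metrics in the abstract, chance-corrected concordance and CSMF accuracy, can be illustrated with a short sketch. This is not the authors' code: the abstract does not state the formulas, so the sketch assumes the definitions commonly used in the verbal autopsy metrics literature, namely per-cause chance-corrected concordance (sensitivity - 1/N) / (1 - 1/N) for N causes, and CSMF accuracy 1 - sum|true - predicted| / (2 * (1 - min true fraction)). The function names and the toy cause labels are hypothetical.

```python
from collections import Counter


def chance_corrected_concordance(true_causes, predicted_causes, causes):
    """Mean per-cause chance-corrected concordance (assumed definition).

    For each cause j with at least one true death:
        CCC_j = (sensitivity_j - 1/N) / (1 - 1/N), N = number of causes.
    """
    n_causes = len(causes)
    chance = 1.0 / n_causes
    per_cause = []
    for cause in causes:
        # Predictions made for deaths whose true cause is `cause`.
        preds = [p for t, p in zip(true_causes, predicted_causes) if t == cause]
        if not preds:
            continue  # cause absent from this test set; skip it
        sensitivity = sum(p == cause for p in preds) / len(preds)
        per_cause.append((sensitivity - chance) / (1.0 - chance))
    return sum(per_cause) / len(per_cause)


def csmf_accuracy(true_causes, predicted_causes, causes):
    """CSMF accuracy (assumed definition).

    1 - sum_j |true_frac_j - pred_frac_j| / (2 * (1 - min_j true_frac_j))
    """
    n = len(true_causes)
    true_counts = Counter(true_causes)
    pred_counts = Counter(predicted_causes)
    true_frac = {c: true_counts[c] / n for c in causes}
    pred_frac = {c: pred_counts[c] / n for c in causes}
    total_abs_error = sum(abs(true_frac[c] - pred_frac[c]) for c in causes)
    # Largest possible total error: every death assigned to the rarest cause.
    worst_case = 2.0 * (1.0 - min(true_frac.values()))
    return 1.0 - total_abs_error / worst_case


# Toy data with hypothetical cause labels, not taken from the study.
causes = ["A", "B", "C"]
truth = ["A", "A", "B", "B", "C", "C"]
guess = ["A", "B", "B", "B", "C", "A"]

ccc = chance_corrected_concordance(truth, guess, causes)
acc = csmf_accuracy(truth, guess, causes)
print(f"chance-corrected concordance: {ccc:.3f}")
print(f"CSMF accuracy: {acc:.3f}")
```

Under these assumed definitions, chance-corrected concordance is an individual-level measure (0 corresponds to random assignment), while CSMF accuracy compares predicted and true cause fractions at the population level, which is why the abstract reports the two separately.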
format Online
Article
Text
id pubmed-3160922
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3160922 2011-08-25 Popul Health Metr Research BioMed Central 2011-08-04 /pmc/articles/PMC3160922/ /pubmed/21816105 http://dx.doi.org/10.1186/1478-7954-9-29 Text en Copyright ©2011 Flaxman et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Random forests for verbal autopsy analysis: multisite validation study using clinical diagnostic gold standards
topic Research