
Regularized Bayesian transfer learning for population-level etiological distributions

Computer-coded verbal autopsy (CCVA) algorithms predict cause of death from high-dimensional family questionnaire data (verbal autopsy) of a deceased individual, which are then aggregated to generate national and regional estimates of cause-specific mortality fractions. These estimates may be inaccu...


Bibliographic Details
Main authors:	Datta, Abhirup; Fiksel, Jacob; Amouzou, Agbessi; Zeger, Scott L
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2020
Subjects:	Articles
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8511959/ https://www.ncbi.nlm.nih.gov/pubmed/32040180 http://dx.doi.org/10.1093/biostatistics/kxaa001

_version_	1784582874550239232
author	Datta, Abhirup; Fiksel, Jacob; Amouzou, Agbessi; Zeger, Scott L
author_facet	Datta, Abhirup; Fiksel, Jacob; Amouzou, Agbessi; Zeger, Scott L
author_sort	Datta, Abhirup
collection	PubMed
description	Computer-coded verbal autopsy (CCVA) algorithms predict cause of death from high-dimensional family questionnaire data (verbal autopsy) of a deceased individual, which are then aggregated to generate national and regional estimates of cause-specific mortality fractions. These estimates may be inaccurate if CCVA is trained on non-local training data different from the local population of interest. This problem is a special case of transfer learning, i.e., improving classification within a target domain (e.g., a particular population) with the classifier trained in a source domain. Most transfer learning approaches concern individual-level (e.g., a person’s) classification. Social and health scientists such as epidemiologists are often more interested in understanding etiological distributions at the population level. The sample sizes of their data sets are typically orders of magnitude smaller than those used for common transfer learning applications like image classification, document identification, etc. We present a parsimonious hierarchical Bayesian transfer learning framework to directly estimate population-level class probabilities in a target domain, using any baseline classifier trained on source-domain data, and a small labeled target-domain dataset. To address small sample sizes, we introduce a novel shrinkage prior for the transfer error rates guaranteeing that, in the absence of any labeled target-domain data or when the baseline classifier is perfectly accurate, our transfer learning agrees with direct aggregation of predictions from the baseline classifier, thereby subsuming the default practice as a special case. We then extend our approach to use an ensemble of baseline classifiers producing a unified estimate. Theoretical and empirical results demonstrate how the ensemble model favors the most accurate baseline classifier. We present data analyses demonstrating the utility of our approach.
format	Online Article Text
id	pubmed-8511959
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-8511959 2021-10-13 Regularized Bayesian transfer learning for population-level etiological distributions Datta, Abhirup; Fiksel, Jacob; Amouzou, Agbessi; Zeger, Scott L Biostatistics Articles Oxford University Press 2020-02-10 /pmc/articles/PMC8511959/ /pubmed/32040180 http://dx.doi.org/10.1093/biostatistics/kxaa001 Text en © The Author 2020. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Articles Datta, Abhirup Fiksel, Jacob Amouzou, Agbessi Zeger, Scott L Regularized Bayesian transfer learning for population-level etiological distributions
title	Regularized Bayesian transfer learning for population-level etiological distributions
topic	Articles
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8511959/ https://www.ncbi.nlm.nih.gov/pubmed/32040180 http://dx.doi.org/10.1093/biostatistics/kxaa001
work_keys_str_mv	AT dattaabhirup regularizedbayesiantransferlearningforpopulationleveletiologicaldistributions AT fikseljacob regularizedbayesiantransferlearningforpopulationleveletiologicaldistributions AT amouzouagbessi regularizedbayesiantransferlearningforpopulationleveletiologicaldistributions AT zegerscottl regularizedbayesiantransferlearningforpopulationleveletiologicaldistributions
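
The description above states that, with no labeled target-domain data, the proposed estimate agrees with direct aggregation of the baseline classifier's predictions. The Python sketch below illustrates that property with a deliberately simplified point estimate. It is not the authors' hierarchical Bayesian model: the function names, the `strength` parameter, and the use of an EM iteration in place of posterior sampling are all assumptions made for illustration.

```python
def shrunk_error_matrix(labeled_pairs, n_classes, strength=1.0):
    """Estimate P(predicted = j | true = i), shrunk toward the identity.

    labeled_pairs: (true_class, predicted_class) pairs from the small
    labeled target-domain sample. `strength` (hypothetical, must be > 0)
    is a pseudo-count added on the diagonal, so each row falls back to
    "no misclassification" when there is no labeled data for it.
    """
    counts = [[0.0] * n_classes for _ in range(n_classes)]
    for true_c, pred_c in labeled_pairs:
        counts[true_c][pred_c] += 1.0
    matrix = []
    for i in range(n_classes):
        row = [counts[i][j] + (strength if i == j else 0.0)
               for j in range(n_classes)]
        total = sum(row)
        matrix.append([v / total for v in row])
    return matrix


def calibrated_fractions(pred_counts, matrix, n_iter=500):
    """EM estimate of class fractions p, where the distribution of
    predicted classes is modeled as q_j = sum_i p_i * matrix[i][j].

    pred_counts: number of unlabeled target-domain cases that the
    baseline classifier assigned to each class.
    """
    n_classes = len(pred_counts)
    total = float(sum(pred_counts))
    p = [1.0 / n_classes] * n_classes  # start from the uniform distribution
    for _ in range(n_iter):
        q = [sum(p[i] * matrix[i][j] for i in range(n_classes))
             for j in range(n_classes)]
        new_p = []
        for i in range(n_classes):
            s = 0.0
            for j in range(n_classes):
                if pred_counts[j] > 0 and q[j] > 0:
                    # expected number of class-i cases among those predicted j
                    s += pred_counts[j] * p[i] * matrix[i][j] / q[j]
            new_p.append(s / total)
        p = new_p
    return p


if __name__ == "__main__":
    # No labeled data: the matrix is the identity, so the estimate is
    # the raw fraction of predictions in each class.
    identity = shrunk_error_matrix([], n_classes=3)
    print(calibrated_fractions([60, 30, 10], identity))

    # A few labeled cases showing that true class 0 is sometimes predicted as 1.
    pairs = [(0, 0)] * 8 + [(0, 1)] * 2 + [(1, 1)] * 10
    print(calibrated_fractions([40, 60], shrunk_error_matrix(pairs, 2)))
```

In this sketch, a larger `strength` pulls the estimate closer to direct aggregation, while more labeled target-domain cases let the observed misclassification pattern dominate.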


Similar items