Mixture density networks for the indirect estimation of reference intervals

BACKGROUND: Reference intervals represent the expected range of physiological test results in a healthy population and are essential to support medical decision making. Particularly in the context of pediatric reference intervals, where recruitment regulations make prospective studies challenging to...

Bibliographic Details
Main Authors: Hepp, Tobias; Zierk, Jakob; Rauh, Manfred; Metzler, Markus; Seitz, Sarem
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9336034/
https://www.ncbi.nlm.nih.gov/pubmed/35906555
http://dx.doi.org/10.1186/s12859-022-04846-0
author Hepp, Tobias
Zierk, Jakob
Rauh, Manfred
Metzler, Markus
Seitz, Sarem
author_sort Hepp, Tobias
collection PubMed
description BACKGROUND: Reference intervals represent the expected range of physiological test results in a healthy population and are essential to support medical decision making. Particularly in the context of pediatric reference intervals, where recruitment regulations make prospective studies challenging to conduct, indirect estimation strategies are becoming increasingly important. Established indirect methods enable robust identification of the distribution of “healthy” samples from laboratory databases, which include unlabeled pathologic cases, but are currently severely limited when adjusting for essential patient characteristics such as age. Here, we propose the use of mixture density networks (MDN) to overcome this problem and model all parameters of the mixture distribution in a single step. RESULTS: Estimated reference intervals from varying settings with simulated data demonstrate the ability to accurately estimate latent distributions from unlabeled data using different implementations of MDNs. Comparing the performance with alternative estimation approaches further highlights the importance of modeling the mixture component weights as a function of the input in order to avoid biased estimates for all other parameters and the resulting reference intervals. We also provide a strategy to generate partially customized starting weights to improve proper identification of the latent components. Finally, the application on real-world hemoglobin samples provides results in line with current gold standard approaches, but also suggests further investigations with respect to adequate regularization strategies in order to prevent overfitting the data. 
CONCLUSIONS: Mixture density networks provide a promising approach capable of extracting the distribution of healthy samples from unlabeled laboratory databases while simultaneously and explicitly estimating all parameters and component weights as non-linear functions of the covariate(s), thereby allowing the estimation of age-dependent reference intervals in a single step. Further studies on model regularization and asymmetric component distributions are warranted to consolidate our findings and expand the scope of applications.
format Online
Article
Text
id pubmed-9336034
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9336034 2022-07-30 Mixture density networks for the indirect estimation of reference intervals Hepp, Tobias Zierk, Jakob Rauh, Manfred Metzler, Markus Seitz, Sarem BMC Bioinformatics Research BioMed Central 2022-07-29 /pmc/articles/PMC9336034/ /pubmed/35906555 http://dx.doi.org/10.1186/s12859-022-04846-0 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Mixture density networks for the indirect estimation of reference intervals
topic Research