
Latent class distributional regression for the estimation of non-linear reference limits from contaminated data sources

BACKGROUND: Medical decision making based on quantitative test results depends on reliable reference intervals, which represent the range of physiological test results in a healthy population. Current methods for the estimation of reference limits focus either on modelling the age-dependent dynamics...


Bibliographic Details
Main Authors: Hepp, Tobias, Zierk, Jakob, Rauh, Manfred, Metzler, Markus, Mayr, Andreas
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7666475/
https://www.ncbi.nlm.nih.gov/pubmed/33187469
http://dx.doi.org/10.1186/s12859-020-03853-3
author Hepp, Tobias
Zierk, Jakob
Rauh, Manfred
Metzler, Markus
Mayr, Andreas
author_sort Hepp, Tobias
collection PubMed
description BACKGROUND: Medical decision making based on quantitative test results depends on reliable reference intervals, which represent the range of physiological test results in a healthy population. Current methods for the estimation of reference limits focus either on modelling the age-dependent dynamics of different analytes directly in a prospective setting or the extraction of independent distributions from contaminated data sources, e.g. data with latent heterogeneity due to unlabeled pathologic cases. In this article, we propose a new method to estimate indirect reference limits with non-linear dependencies on covariates from contaminated datasets by combining the framework of mixture models and distributional regression. RESULTS: Simulation results based on mixtures of Gaussian and gamma distributions suggest accurate approximation of the true quantiles that improves with increasing sample size and decreasing overlap between the mixture components. Due to the high flexibility of the framework, initialization of the algorithm requires careful considerations regarding appropriate starting weights. Estimated quantiles from the extracted distribution of healthy hemoglobin concentration in boys and girls provide clinically useful pediatric reference limits similar to solutions obtained using different approaches which require more samples and are computationally more expensive. CONCLUSIONS: Latent class distributional regression models represent the first method to estimate indirect non-linear reference limits from a single model fit, but the general scope of applications can be extended to other scenarios with latent heterogeneity.
format Online
Article
Text
id pubmed-7666475
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7666475 2020-11-16 Latent class distributional regression for the estimation of non-linear reference limits from contaminated data sources Hepp, Tobias Zierk, Jakob Rauh, Manfred Metzler, Markus Mayr, Andreas BMC Bioinformatics Methodology Article
BioMed Central 2020-11-13 /pmc/articles/PMC7666475/ /pubmed/33187469 http://dx.doi.org/10.1186/s12859-020-03853-3 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Latent class distributional regression for the estimation of non-linear reference limits from contaminated data sources
topic Methodology Article