A Bayesian Modelling Approach with Balancing Informative Prior for Analysing Imbalanced Data
When a dataset is imbalanced, the prediction of the scarcely-sampled subpopulation can be over-influenced by the population contributing to the majority of the data. The aim of this study was to develop a Bayesian modelling approach with balancing informative prior so that the influence of imbalance...
Main authors: | Klein, Kerenaftali; Hennig, Stefanie; Paul, Sanjoy Ketan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4829197/ https://www.ncbi.nlm.nih.gov/pubmed/27070549 http://dx.doi.org/10.1371/journal.pone.0152700 |
_version_ | 1782426714214760448 |
---|---|
author | Klein, Kerenaftali; Hennig, Stefanie; Paul, Sanjoy Ketan |
author_facet | Klein, Kerenaftali; Hennig, Stefanie; Paul, Sanjoy Ketan |
author_sort | Klein, Kerenaftali |
collection | PubMed |
description | When a dataset is imbalanced, the prediction of the scarcely-sampled subpopulation can be over-influenced by the population contributing to the majority of the data. The aim of this study was to develop a Bayesian modelling approach with balancing informative prior so that the influence of imbalance to the overall prediction could be minimised. The new approach was developed in order to weigh the data in favour of the smaller subset(s). The method was assessed in terms of bias and precision in predicting model parameter estimates of simulated datasets. Moreover, the method was evaluated in predicting optimal dose levels of tobramycin for various age groups in a motivating example. The bias estimates using the balancing informative prior approach were smaller than those generated using the conventional approach which was without the consideration for the imbalance in the datasets. The precision estimates were also superior. The method was further evaluated in a motivating example of optimal dosage prediction of tobramycin. The resulting predictions also agreed well with what had been reported in the literature. The proposed Bayesian balancing informative prior approach has shown a real potential to adequately weigh the data in favour of smaller subset(s) of data to generate robust prediction models. |
format | Online Article Text |
id | pubmed-4829197 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-48291972016-04-22 A Bayesian Modelling Approach with Balancing Informative Prior for Analysing Imbalanced Data Klein, Kerenaftali Hennig, Stefanie Paul, Sanjoy Ketan PLoS One Research Article When a dataset is imbalanced, the prediction of the scarcely-sampled subpopulation can be over-influenced by the population contributing to the majority of the data. The aim of this study was to develop a Bayesian modelling approach with balancing informative prior so that the influence of imbalance to the overall prediction could be minimised. The new approach was developed in order to weigh the data in favour of the smaller subset(s). The method was assessed in terms of bias and precision in predicting model parameter estimates of simulated datasets. Moreover, the method was evaluated in predicting optimal dose levels of tobramycin for various age groups in a motivating example. The bias estimates using the balancing informative prior approach were smaller than those generated using the conventional approach which was without the consideration for the imbalance in the datasets. The precision estimates were also superior. The method was further evaluated in a motivating example of optimal dosage prediction of tobramycin. The resulting predictions also agreed well with what had been reported in the literature. The proposed Bayesian balancing informative prior approach has shown a real potential to adequately weigh the data in favour of smaller subset(s) of data to generate robust prediction models. 
Public Library of Science 2016-04-12 /pmc/articles/PMC4829197/ /pubmed/27070549 http://dx.doi.org/10.1371/journal.pone.0152700 Text en © 2016 Klein et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Klein, Kerenaftali Hennig, Stefanie Paul, Sanjoy Ketan A Bayesian Modelling Approach with Balancing Informative Prior for Analysing Imbalanced Data |
title | A Bayesian Modelling Approach with Balancing Informative Prior for Analysing Imbalanced Data |
title_full | A Bayesian Modelling Approach with Balancing Informative Prior for Analysing Imbalanced Data |
title_fullStr | A Bayesian Modelling Approach with Balancing Informative Prior for Analysing Imbalanced Data |
title_full_unstemmed | A Bayesian Modelling Approach with Balancing Informative Prior for Analysing Imbalanced Data |
title_short | A Bayesian Modelling Approach with Balancing Informative Prior for Analysing Imbalanced Data |
title_sort | bayesian modelling approach with balancing informative prior for analysing imbalanced data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4829197/ https://www.ncbi.nlm.nih.gov/pubmed/27070549 http://dx.doi.org/10.1371/journal.pone.0152700 |
work_keys_str_mv | AT kleinkerenaftali abayesianmodellingapproachwithbalancinginformativepriorforanalysingimbalanceddata AT hennigstefanie abayesianmodellingapproachwithbalancinginformativepriorforanalysingimbalanceddata AT paulsanjoyketan abayesianmodellingapproachwithbalancinginformativepriorforanalysingimbalanceddata AT kleinkerenaftali bayesianmodellingapproachwithbalancinginformativepriorforanalysingimbalanceddata AT hennigstefanie bayesianmodellingapproachwithbalancinginformativepriorforanalysingimbalanceddata AT paulsanjoyketan bayesianmodellingapproachwithbalancinginformativepriorforanalysingimbalanceddata |