
Discriminative variable subsets in Bayesian classification with mixture models, with application in flow cytometry studies

We discuss the evaluation of subsets of variables for the discriminative evidence they provide in multivariate mixture modeling for classification. The novel development of Bayesian classification analysis presented is partly motivated by problems of design and selection of variables in biomolecular...


Bibliographic Details
Main authors: Lin, Lin; Chan, Cliburn; West, Mike
Format: Online Article Text
Language: English
Published: Oxford University Press 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4679067/
https://www.ncbi.nlm.nih.gov/pubmed/26040910
http://dx.doi.org/10.1093/biostatistics/kxv021
author Lin, Lin
Chan, Cliburn
West, Mike
collection PubMed
description We discuss the evaluation of subsets of variables for the discriminative evidence they provide in multivariate mixture modeling for classification. The novel development of Bayesian classification analysis presented is partly motivated by problems of design and selection of variables in biomolecular studies, particularly involving widely used assays of large-scale single-cell data generated using flow cytometry technology. For such studies and for mixture modeling generally, we define discriminative analysis that overlays fitted mixture models using a natural measure of concordance between mixture component densities, and define an effective and computationally feasible method for assessing and prioritizing subsets of variables according to their roles in discrimination of one or more mixture components. We relate the new discriminative information measures to Bayesian classification probabilities and error rates, and exemplify their use in Bayesian analysis of Dirichlet process mixture models fitted via Markov chain Monte Carlo methods as well as using a novel Bayesian expectation–maximization algorithm. We present a series of theoretical and simulated data examples to fix concepts and exhibit the utility of the approach, and compare with prior approaches. We demonstrate application in the context of automatic classification and discriminative variable selection in high-throughput systems biology using large flow cytometry datasets.
format Online
Article
Text
id pubmed-4679067
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4679067 2015-12-16 Discriminative variable subsets in Bayesian classification with mixture models, with application in flow cytometry studies Lin, Lin Chan, Cliburn West, Mike Biostatistics Articles Oxford University Press 2016-01 2015-06-03 /pmc/articles/PMC4679067/ /pubmed/26040910 http://dx.doi.org/10.1093/biostatistics/kxv021 Text en © The Author 2015. Published by Oxford University Press.
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Discriminative variable subsets in Bayesian classification with mixture models, with application in flow cytometry studies
topic Articles