
Distribution Optimization: An evolutionary algorithm to separate Gaussian mixtures

Finding subgroups in biomedical data is a key task in biomedical research and precision medicine. Even one-dimensional data, such as many different readouts from cell experiments, preclinical or human laboratory experiments or clinical signs, often reveal a more complex distribution than a single...


Bibliographic Details
Main authors: Lerch, Florian; Ultsch, Alfred; Lötsch, Jörn
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6971287/
https://www.ncbi.nlm.nih.gov/pubmed/31959878
http://dx.doi.org/10.1038/s41598-020-57432-w
author Lerch, Florian
Ultsch, Alfred
Lötsch, Jörn
collection PubMed
description Finding subgroups in biomedical data is a key task in biomedical research and precision medicine. Even one-dimensional data, such as many different readouts from cell experiments, preclinical or human laboratory experiments or clinical signs, often reveal a more complex distribution than a single mode. Gaussian mixtures play an important role in the multimodal distribution of one-dimensional data. However, although fitting of Gaussian mixture models (GMM) is often aimed at obtaining the separate modes composing the mixture, current technical implementations, often using the Expectation Maximization (EM) algorithm, are not optimized for this task. This occasionally results in poorly separated modes that are unsuitable for determining a distinguishable group structure in the data. Here, we introduce “Distribution Optimization”, an evolutionary algorithm for GMM fitting that uses an adjustable error function based on chi-square statistics and the probability density. The algorithm can be directly targeted at the separation of the modes of the mixture by employing an additional criterion for the degree to which single modes overlap. The obtained GMM fits were comparable with those obtained with classical EM-based fits, except for data sets where the EM algorithm produced unsatisfactory results with overlapping Gaussian modes. There, the proposed algorithm successfully separated the modes, providing a basis for meaningful group separation while fitting the data satisfactorily. Through its optimization toward mode separation, the evolutionary algorithm proved a particularly suitable basis for group separation in multimodally distributed data, outperforming alternative EM-based methods.
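The description above mentions an error function that combines a chi-square statistic with a criterion for mode overlap. The following is a minimal Python sketch of that general idea, not the authors' implementation: the function names, the equal-width binning, the overlap definition (area under the smaller of two weighted densities, relative to the smaller weight) and the overlap_weight parameter are all assumptions made for illustration. The evolutionary search itself is omitted; the sketch only shows a candidate score that such a search could minimize.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mean, sd):
    """Cumulative distribution function; also accepts +/- infinity."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def chi_square_error(data, weights, means, sds, n_bins=10):
    """Pearson chi-square between observed bin counts and the counts
    expected under the mixture. Assumes the data are not all identical."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_bins
    observed = [0] * n_bins
    for x in data:
        observed[min(int((x - lo) / width), n_bins - 1)] += 1
    error = 0.0
    for i, count in enumerate(observed):
        # The outer bins extend to infinity so the bin probabilities sum to 1.
        left = -math.inf if i == 0 else lo + i * width
        right = math.inf if i == n_bins - 1 else lo + (i + 1) * width
        prob = sum(
            w * (normal_cdf(right, m, s) - normal_cdf(left, m, s))
            for w, m, s in zip(weights, means, sds)
        )
        expected = max(len(data) * prob, 1e-9)  # guard against division by zero
        error += (count - expected) ** 2 / expected
    return error


def max_pairwise_overlap(weights, means, sds, grid=2000):
    """Hypothetical overlap measure: largest shared area between any two
    weighted components, relative to the smaller weight (0 = disjoint,
    1 = one component fully covered by another)."""
    lo = min(m - 6.0 * s for m, s in zip(means, sds))
    hi = max(m + 6.0 * s for m, s in zip(means, sds))
    step = (hi - lo) / grid
    worst = 0.0
    for i in range(len(weights)):
        for j in range(i + 1, len(weights)):
            area = 0.0
            for g in range(grid):
                x = lo + (g + 0.5) * step  # midpoint rule
                area += step * min(
                    weights[i] * normal_pdf(x, means[i], sds[i]),
                    weights[j] * normal_pdf(x, means[j], sds[j]),
                )
            worst = max(worst, area / min(weights[i], weights[j]))
    return worst


def fit_error(data, weights, means, sds, overlap_weight=100.0):
    """Adjustable score: goodness of fit plus a penalty on mode overlap."""
    return (chi_square_error(data, weights, means, sds)
            + overlap_weight * max_pairwise_overlap(weights, means, sds))
```

Raising overlap_weight pushes a search toward well-separated modes, while setting it to 0 reduces the score to the chi-square fit alone; the default of 100 is arbitrary.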
format Online
Article
Text
id pubmed-6971287
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-6971287 2020-01-27 Distribution Optimization: An evolutionary algorithm to separate Gaussian mixtures Lerch, Florian Ultsch, Alfred Lötsch, Jörn Sci Rep Article
Nature Publishing Group UK 2020-01-20 /pmc/articles/PMC6971287/ /pubmed/31959878 http://dx.doi.org/10.1038/s41598-020-57432-w Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Distribution Optimization: An evolutionary algorithm to separate Gaussian mixtures
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6971287/
https://www.ncbi.nlm.nih.gov/pubmed/31959878
http://dx.doi.org/10.1038/s41598-020-57432-w