PyMix - The Python mixture package - a tool for clustering of heterogeneous biological data

BACKGROUND: Cluster analysis is an important technique for the exploratory analysis of biological data. Such data is often high-dimensional, inherently noisy and contains outliers. This makes clustering challenging. Mixtures are versatile and powerful statistical models which perform robustly for cl...

Bibliographic Details
Main authors: Georgi, Benjamin; Costa, Ivan Gesteira; Schliep, Alexander
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2823712/
https://www.ncbi.nlm.nih.gov/pubmed/20053276
http://dx.doi.org/10.1186/1471-2105-11-9
_version_ 1782177669616500736
author Georgi, Benjamin
Costa, Ivan Gesteira
Schliep, Alexander
author_facet Georgi, Benjamin
Costa, Ivan Gesteira
Schliep, Alexander
author_sort Georgi, Benjamin
collection PubMed
description BACKGROUND: Cluster analysis is an important technique for the exploratory analysis of biological data. Such data is often high-dimensional, inherently noisy and contains outliers. This makes clustering challenging. Mixtures are versatile and powerful statistical models which perform robustly for clustering in the presence of noise and have been successfully applied in a wide range of applications. RESULTS: PyMix - the Python mixture package implements algorithms and data structures for clustering with basic and advanced mixture models. The advanced models include context-specific independence mixtures, mixtures of dependence trees and semi-supervised learning. PyMix is licenced under the GNU General Public licence (GPL). PyMix has been successfully used for the analysis of biological sequence, complex disease and gene expression data. CONCLUSIONS: PyMix is a useful tool for cluster analysis of biological data. Due to the general nature of the framework, PyMix can be applied to a wide range of applications and data sets.
format Text
id pubmed-2823712
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2823712 2010-02-18 PyMix - The Python mixture package - a tool for clustering of heterogeneous biological data Georgi, Benjamin Costa, Ivan Gesteira Schliep, Alexander BMC Bioinformatics Software BACKGROUND: Cluster analysis is an important technique for the exploratory analysis of biological data. Such data is often high-dimensional, inherently noisy and contains outliers. This makes clustering challenging. Mixtures are versatile and powerful statistical models which perform robustly for clustering in the presence of noise and have been successfully applied in a wide range of applications. RESULTS: PyMix - the Python mixture package implements algorithms and data structures for clustering with basic and advanced mixture models. The advanced models include context-specific independence mixtures, mixtures of dependence trees and semi-supervised learning. PyMix is licenced under the GNU General Public licence (GPL). PyMix has been successfully used for the analysis of biological sequence, complex disease and gene expression data. CONCLUSIONS: PyMix is a useful tool for cluster analysis of biological data. Due to the general nature of the framework, PyMix can be applied to a wide range of applications and data sets. BioMed Central 2010-01-06 /pmc/articles/PMC2823712/ /pubmed/20053276 http://dx.doi.org/10.1186/1471-2105-11-9 Text en Copyright © 2010 Georgi et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title PyMix - The Python mixture package - a tool for clustering of heterogeneous biological data
title_sort pymix - the python mixture package - a tool for clustering of heterogeneous biological data
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2823712/
https://www.ncbi.nlm.nih.gov/pubmed/20053276
http://dx.doi.org/10.1186/1471-2105-11-9