Bayesian clustering and feature selection for cancer tissue samples

BACKGROUND: The versatility of DNA copy number amplifications for profiling and categorization of various tissue samples has been widely acknowledged in the biomedical literature. For instance, this type of measurement technique provides possibilities for exploring sets of cancerous tissues to iden...

Bibliographic Details
Main authors: Marttinen, Pekka, Myllykangas, Samuel, Corander, Jukka
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2679022/
https://www.ncbi.nlm.nih.gov/pubmed/19296858
http://dx.doi.org/10.1186/1471-2105-10-90
collection PubMed
description BACKGROUND: The versatility of DNA copy number amplifications for profiling and categorization of various tissue samples has been widely acknowledged in the biomedical literature. For instance, this type of measurement technique provides possibilities for exploring sets of cancerous tissues to identify novel subtypes. The previously utilized statistical approaches to various kinds of analyses include traditional algorithmic techniques for clustering and dimension reduction, such as independent and principal component analyses and hierarchical clustering, as well as model-based clustering using maximum likelihood estimation for latent class models. RESULTS: While purely algorithmic methods are usually easily applicable, their suboptimal performance and limitations in making formal inference have been thoroughly discussed in the statistical literature. Here we introduce a Bayesian model-based approach to simultaneous identification of underlying tissue groups and the informative amplifications. The model-based approach provides the possibility of using formal inference to determine the number of groups from the data, in contrast to the ad hoc methods often exploited for similar purposes. The model also automatically recognizes the chromosomal areas that are relevant for the clustering. CONCLUSION: Validatory analyses of simulated data and a large database of DNA copy number amplifications in human neoplasms are used to illustrate the potential of our approach. Our software implementation BASTA for performing Bayesian statistical tissue profiling is freely available for academic purposes at
format Text
id pubmed-2679022
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2679022 2009-05-08 Bayesian clustering and feature selection for cancer tissue samples. Marttinen, Pekka; Myllykangas, Samuel; Corander, Jukka. BMC Bioinformatics. Research Article. BioMed Central 2009-03-18 /pmc/articles/PMC2679022/ /pubmed/19296858 http://dx.doi.org/10.1186/1471-2105-10-90 Text en Copyright © 2009 Marttinen et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Bayesian clustering and feature selection for cancer tissue samples
topic Research Article