Bayesian clustering and feature selection for cancer tissue samples
BACKGROUND: The versatility of DNA copy number amplifications for profiling and categorization of various tissue samples has been widely acknowledged in the biomedical literature. For instance, this type of measurement techniques provides possibilities for exploring sets of cancerous tissues to iden...
Main authors: | Marttinen, Pekka; Myllykangas, Samuel; Corander, Jukka |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2009 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2679022/ https://www.ncbi.nlm.nih.gov/pubmed/19296858 http://dx.doi.org/10.1186/1471-2105-10-90 |
_version_ | 1782166869840494592 |
---|---|
author | Marttinen, Pekka Myllykangas, Samuel Corander, Jukka |
author_facet | Marttinen, Pekka Myllykangas, Samuel Corander, Jukka |
author_sort | Marttinen, Pekka |
collection | PubMed |
description | BACKGROUND: The versatility of DNA copy number amplifications for profiling and categorization of various tissue samples has been widely acknowledged in the biomedical literature. For instance, this type of measurement techniques provides possibilities for exploring sets of cancerous tissues to identify novel subtypes. The previously utilized statistical approaches to various kinds of analyses include traditional algorithmic techniques for clustering and dimension reduction, such as independent and principal component analyses, hierarchical clustering, as well as model-based clustering using maximum likelihood estimation for latent class models. RESULTS: While purely algorithmic methods are usually easily applicable, their suboptimal performance and limitations in making formal inference have been thoroughly discussed in the statistical literature. Here we introduce a Bayesian model-based approach to simultaneous identification of underlying tissue groups and the informative amplifications. The model-based approach provides the possibility of using formal inference to determine the number of groups from the data, in contrast to the ad hoc methods often exploited for similar purposes. The model also automatically recognizes the chromosomal areas that are relevant for the clustering. CONCLUSION: Validatory analyses of simulated data and a large database of DNA copy number amplifications in human neoplasms are used to illustrate the potential of our approach. Our software implementation BASTA for performing Bayesian statistical tissue profiling is freely available for academic purposes at |
format | Text |
id | pubmed-2679022 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2009 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-26790222009-05-08 Bayesian clustering and feature selection for cancer tissue samples Marttinen, Pekka Myllykangas, Samuel Corander, Jukka BMC Bioinformatics Research Article BACKGROUND: The versatility of DNA copy number amplifications for profiling and categorization of various tissue samples has been widely acknowledged in the biomedical literature. For instance, this type of measurement techniques provides possibilities for exploring sets of cancerous tissues to identify novel subtypes. The previously utilized statistical approaches to various kinds of analyses include traditional algorithmic techniques for clustering and dimension reduction, such as independent and principal component analyses, hierarchical clustering, as well as model-based clustering using maximum likelihood estimation for latent class models. RESULTS: While purely algorithmic methods are usually easily applicable, their suboptimal performance and limitations in making formal inference have been thoroughly discussed in the statistical literature. Here we introduce a Bayesian model-based approach to simultaneous identification of underlying tissue groups and the informative amplifications. The model-based approach provides the possibility of using formal inference to determine the number of groups from the data, in contrast to the ad hoc methods often exploited for similar purposes. The model also automatically recognizes the chromosomal areas that are relevant for the clustering. CONCLUSION: Validatory analyses of simulated data and a large database of DNA copy number amplifications in human neoplasms are used to illustrate the potential of our approach. Our software implementation BASTA for performing Bayesian statistical tissue profiling is freely available for academic purposes at BioMed Central 2009-03-18 /pmc/articles/PMC2679022/ /pubmed/19296858 http://dx.doi.org/10.1186/1471-2105-10-90 Text en Copyright © 2009 Marttinen et al; licensee BioMed Central Ltd. 
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Marttinen, Pekka Myllykangas, Samuel Corander, Jukka Bayesian clustering and feature selection for cancer tissue samples |
title | Bayesian clustering and feature selection for cancer tissue samples |
title_full | Bayesian clustering and feature selection for cancer tissue samples |
title_fullStr | Bayesian clustering and feature selection for cancer tissue samples |
title_full_unstemmed | Bayesian clustering and feature selection for cancer tissue samples |
title_short | Bayesian clustering and feature selection for cancer tissue samples |
title_sort | bayesian clustering and feature selection for cancer tissue samples |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2679022/ https://www.ncbi.nlm.nih.gov/pubmed/19296858 http://dx.doi.org/10.1186/1471-2105-10-90 |
work_keys_str_mv | AT marttinenpekka bayesianclusteringandfeatureselectionforcancertissuesamples AT myllykangassamuel bayesianclusteringandfeatureselectionforcancertissuesamples AT coranderjukka bayesianclusteringandfeatureselectionforcancertissuesamples |