
A Hidden Markov Model to estimate population mixture and allelic copy-numbers in cancers using Affymetrix SNP arrays



Bibliographic Details
Main Authors: Lamy, Philippe; Andersen, Claus L; Dyrskjot, Lars; Torring, Niels; Wiuf, Carsten
Format: Text
Language: English
Published: BioMed Central 2007
Subjects: Methodology Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2206057/
https://www.ncbi.nlm.nih.gov/pubmed/17996079
http://dx.doi.org/10.1186/1471-2105-8-434
author Lamy, Philippe
Andersen, Claus L
Dyrskjot, Lars
Torring, Niels
Wiuf, Carsten
collection PubMed
description BACKGROUND: Affymetrix SNP arrays can interrogate thousands of SNPs at the same time. This allows us to look at the genomic content of cancer cells and to investigate the underlying events leading to cancer. Genomic copy-numbers are today routinely derived from SNP array data, but the proposed algorithms for this task most often disregard the genotype information available from germline cells in paired germline-tumour samples. Including this information may deepen our understanding of the "true" biological situation, e.g. by enabling analysis of allele-specific copy-numbers. Here we rely on matched germline-tumour samples and have developed a Hidden Markov Model (HMM) to estimate allelic copy-number changes in tumour cells. Further, with this approach we are able to estimate the proportion of normal cells in the tumour (mixture proportion). RESULTS: We show that our method is able to recover the underlying copy-number changes in simulated data sets with high accuracy (above 97.71%). Moreover, although the known copy-numbers could be well recovered in simulated cancer samples with more than 70% cancer cells (and less than 30% normal cells), we demonstrate that including the mixture proportion in the HMM increases the accuracy of the method. Finally, the method is tested on HapMap samples and on bladder and prostate cancer samples. CONCLUSION: The HMM method developed here uses the genotype calls of germline DNA and the allelic SNP intensities from the tumour DNA to estimate allelic copy-numbers (including changes) in the tumour. It differentiates between different events like uniparental disomy and allelic imbalances. Moreover, the HMM can estimate the mixture proportion, and thus inform about the purity of the tumour sample.
format Text
id pubmed-2206057
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2206057 2008-01-18 BMC Bioinformatics Methodology Article BioMed Central 2007-11-09 /pmc/articles/PMC2206057/ /pubmed/17996079 http://dx.doi.org/10.1186/1471-2105-8-434 Text en Copyright © 2007 Lamy et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A Hidden Markov Model to estimate population mixture and allelic copy-numbers in cancers using Affymetrix SNP arrays
topic Methodology Article