Identification of Allelic Imbalance with a Statistical Model for Subtle Genomic Mosaicism

Genetic heterogeneity in a mixed sample of tumor and normal DNA can confound characterization of the tumor genome. Numerous computational methods have been proposed to detect aberrations in DNA samples from tumor and normal tissue mixtures. Most of these require tumor purities to be at least 10–15%....

Bibliographic Details
Main Authors: Xia, Rui, Vattathil, Selina, Scheet, Paul
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4148184/
https://www.ncbi.nlm.nih.gov/pubmed/25166618
http://dx.doi.org/10.1371/journal.pcbi.1003765
author Xia, Rui
Vattathil, Selina
Scheet, Paul
collection PubMed
description Genetic heterogeneity in a mixed sample of tumor and normal DNA can confound characterization of the tumor genome. Numerous computational methods have been proposed to detect aberrations in DNA samples from tumor and normal tissue mixtures. Most of these require tumor purities to be at least 10–15%. Here, we present a statistical model to capture information, contained in the individual's germline haplotypes, about expected patterns in the B allele frequencies from SNP microarrays while fully modeling their magnitude, the first such model for SNP microarray data. Our model consists of a pair of hidden Markov models—one for the germline and one for the tumor genome—which, conditional on the observed array data and patterns of population haplotype variation, have a dependence structure induced by the relative imbalance of an individual's inherited haplotypes. Together, these hidden Markov models offer a powerful approach for dealing with mixtures of DNA where the main component represents the germline, thus suggesting natural applications for the characterization of primary clones when stromal contamination is extremely high, and for identifying lesions in rare subclones of a tumor when tumor purity is sufficient to characterize the primary lesions. Our joint model for germline haplotypes and acquired DNA aberration is flexible, allowing a large number of chromosomal alterations, including balanced and imbalanced losses and gains, copy-neutral loss-of-heterozygosity (LOH) and tetraploidy. We found our model (which we term J-LOH) to be superior for localizing rare aberrations in a simulated 3% mixture sample. More generally, our model provides a framework for full integration of the germline and tumor genomes to deal more effectively with missing or uncertain features, and thus extract maximal information from difficult scenarios where existing methods fail.
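To make the difficulty of the 3% mixture concrete, the following is a minimal sketch of the expected B allele frequency (BAF) at a germline-heterozygous SNP when a small fraction of cells carries an aberration. It is not the J-LOH model described above; it only computes the noiseless mixture proportion. The function name `expected_baf`, the scenario labels, and the assumption of one A and one B copy in every normal cell are illustrative choices, not taken from the article.

```python
def expected_baf(tumor_fraction, tumor_a, tumor_b, normal_a=1, normal_b=1):
    """Expected BAF at a heterozygous SNP in a tumor/normal cell mixture.

    tumor_fraction   -- fraction of cells carrying the aberration (0..1)
    tumor_a, tumor_b -- copies of the A and B alleles in aberrant cells
    normal_a, normal_b -- copies in normal cells (assumed 1 and 1)
    """
    if not 0.0 <= tumor_fraction <= 1.0:
        raise ValueError("tumor_fraction must be between 0 and 1")
    normal_fraction = 1.0 - tumor_fraction
    # Average number of B copies and of all copies per cell in the mixture.
    b_copies = normal_fraction * normal_b + tumor_fraction * tumor_b
    all_copies = (normal_fraction * (normal_a + normal_b)
                  + tumor_fraction * (tumor_a + tumor_b))
    if all_copies == 0:
        raise ValueError("no DNA copies at this locus")
    return b_copies / all_copies


# Hypothetical scenarios, given as (A copies, B copies) in aberrant cells.
SCENARIOS = {
    "no aberration": (1, 1),
    "loss of A allele": (0, 1),
    "copy-neutral LOH (B retained)": (0, 2),
    "gain of B allele": (1, 2),
    "balanced tetraploidy": (2, 2),
}

if __name__ == "__main__":
    for name, (a, b) in SCENARIOS.items():
        baf = expected_baf(0.03, a, b)
        print(f"{name}: BAF = {baf:.4f} (shift {baf - 0.5:+.4f})")
```

Under these assumptions, none of the scenarios moves the expected BAF more than 0.015 away from 0.5 at a 3% aberrant-cell fraction, and a balanced change produces no shift at all. This is consistent with the abstract's point that additional information, such as the germline haplotypes, is needed at very low purities.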
format Online
Article
Text
id pubmed-4148184
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4148184 2014-08-29 Identification of Allelic Imbalance with a Statistical Model for Subtle Genomic Mosaicism. Xia, Rui; Vattathil, Selina; Scheet, Paul. PLoS Comput Biol. Research Article. Public Library of Science 2014-08-28 /pmc/articles/PMC4148184/ /pubmed/25166618 http://dx.doi.org/10.1371/journal.pcbi.1003765 Text en © 2014 Xia et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Identification of Allelic Imbalance with a Statistical Model for Subtle Genomic Mosaicism
topic Research Article