hsegHMM: hidden Markov model-based allele-specific copy number alteration analysis accounting for hypersegmentation
BACKGROUND: Somatic copy number alteration (SCNA) is a common feature of the cancer genome and is associated with cancer etiology and prognosis. The allele-specific SCNA analysis of a tumor sample aims to identify the allele-specific copy numbers of both alleles, adjusting for the ploidy and the tu...
Main Authors: | Choo-Wosoba, Hyoyoung; Albert, Paul S.; Zhu, Bin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2018 |
Subjects: | Methodology Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6236906/ https://www.ncbi.nlm.nih.gov/pubmed/30428830 http://dx.doi.org/10.1186/s12859-018-2412-y |
_version_ | 1783371107895607296 |
---|---|
author | Choo-Wosoba, Hyoyoung Albert, Paul S. Zhu, Bin |
author_facet | Choo-Wosoba, Hyoyoung Albert, Paul S. Zhu, Bin |
author_sort | Choo-Wosoba, Hyoyoung |
collection | PubMed |
description | BACKGROUND: Somatic copy number alteration (SCNA) is a common feature of the cancer genome and is associated with cancer etiology and prognosis. Allele-specific SCNA analysis of a tumor sample aims to identify the copy numbers of both alleles, adjusting for ploidy and tumor purity. Next-generation sequencing platforms produce abundant read counts at base-pair resolution across the exome or whole genome, and such data are susceptible to hypersegmentation, a phenomenon in which numerous very short regions are falsely identified as SCNAs. RESULTS: We propose hsegHMM, a hidden Markov model approach to allele-specific SCNA analysis that accounts for hypersegmentation. hsegHMM provides statistical inference of copy number profiles using an efficient expectation-maximization (EM) algorithm. Through simulation and application studies, we found that hsegHMM handles hypersegmentation effectively by using a t-distribution as part of the emission probability distribution and a carefully defined state space. We also compared hsegHMM with FACETS, a current method for allele-specific SCNA analysis. For the application, we used a renal cell carcinoma sample from The Cancer Genome Atlas (TCGA) study. CONCLUSIONS: We demonstrate the robustness of hsegHMM to hypersegmentation. Furthermore, hsegHMM quantifies the uncertainty in identifying allele-specific SCNAs across all chromosomes. hsegHMM performs better than FACETS when read depth (coverage) is uneven across the genome. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2412-y) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-6236906 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6236906 2018-11-20 hsegHMM: hidden Markov model-based allele-specific copy number alteration analysis accounting for hypersegmentation Choo-Wosoba, Hyoyoung Albert, Paul S. Zhu, Bin BMC Bioinformatics Methodology Article
BioMed Central 2018-11-14 /pmc/articles/PMC6236906/ /pubmed/30428830 http://dx.doi.org/10.1186/s12859-018-2412-y Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article Choo-Wosoba, Hyoyoung Albert, Paul S. Zhu, Bin hsegHMM: hidden Markov model-based allele-specific copy number alteration analysis accounting for hypersegmentation |
title | hsegHMM: hidden Markov model-based allele-specific copy number alteration analysis accounting for hypersegmentation |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6236906/ https://www.ncbi.nlm.nih.gov/pubmed/30428830 http://dx.doi.org/10.1186/s12859-018-2412-y |
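
The description credits a t-distribution in the emission model with making the HMM robust to hypersegmentation. The following is a minimal sketch of that general idea, not the hsegHMM implementation: it assumes a toy two-state HMM with fixed, hand-picked parameters (no EM fitting), a synthetic signal with one outlier, and Viterbi decoding. All function names, parameter values, and the data are hypothetical and chosen only for illustration.

```python
import math
import numpy as np


def gaussian_logpdf(x, mean, scale):
    """Log-density of a normal distribution, evaluated elementwise."""
    z = (x - mean) / scale
    return -0.5 * z**2 - math.log(scale) - 0.5 * math.log(2 * math.pi)


def student_t_logpdf(x, mean, scale, df):
    """Log-density of a location-scale Student t distribution."""
    z = (x - mean) / scale
    const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi) - math.log(scale))
    return const - 0.5 * (df + 1) * np.log1p(z**2 / df)


def viterbi(log_emis, log_trans, log_init):
    """Most likely state path for an HMM given log-probabilities."""
    n, k = log_emis.shape
    score = log_init + log_emis[0]
    back = np.zeros((n, k), dtype=int)
    for t in range(1, n):
        # cand[i, j]: best score ending in state i at t-1, then moving to j
        cand = score[:, None] + log_trans
        back[t] = np.argmax(cand, axis=0)
        score = cand[back[t], np.arange(k)] + log_emis[t]
    path = np.empty(n, dtype=int)
    path[-1] = int(np.argmax(score))
    for t in range(n - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path


def count_segments(path):
    """Number of runs of identical consecutive states."""
    return 1 + int(np.count_nonzero(np.diff(path)))


# Hypothetical signal: 20 points near 0, 20 points near 1, one outlier.
wiggle = 0.05 * np.where(np.arange(40) % 2 == 0, 1.0, -1.0)
signal = np.concatenate([np.zeros(20), np.ones(20)]) + wiggle
signal[10] = 5.0  # single aberrant observation inside the first segment

# Assumed (not estimated) model parameters.
means = np.array([0.0, 1.0])
scale = 0.2
stay = 0.99
log_trans = np.log(np.array([[stay, 1 - stay], [1 - stay, stay]]))
log_init = np.log(np.array([0.5, 0.5]))

gauss_emis = np.stack(
    [gaussian_logpdf(signal, m, scale) for m in means], axis=1)
t_emis = np.stack(
    [student_t_logpdf(signal, m, scale, df=3) for m in means], axis=1)

gauss_path = viterbi(gauss_emis, log_trans, log_init)
t_path = viterbi(t_emis, log_trans, log_init)

print("Gaussian emission segments:", count_segments(gauss_path))
print("Student-t emission segments:", count_segments(t_path))
```

In this toy setting the Gaussian emission penalizes the outlier so heavily that the decoded path switches state for a single point, creating spurious short segments, whereas the heavy-tailed t emission absorbs the outlier and keeps the segment intact.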