
HaplotypeCN: Copy Number Haplotype Inference with Hidden Markov Model and Localized Haplotype Clustering

Copy number variation (CNV) has been reported to be associated with disease and various cancers. Hence, accurately identifying the position and type of each CNV is a critical issue. Many tools exist for detecting CNV regions, constructing haplotype phases on CNV regions, or estim...


Bibliographic Details
Main authors: Lin, Yen-Jen; Chen, Yu-Tin; Hsu, Shu-Ni; Peng, Chien-Hua; Tang, Chuan-Yi; Yen, Tzu-Chen; Hsieh, Wen-Ping
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4029584/
https://www.ncbi.nlm.nih.gov/pubmed/24849202
http://dx.doi.org/10.1371/journal.pone.0096841
author Lin, Yen-Jen
Chen, Yu-Tin
Hsu, Shu-Ni
Peng, Chien-Hua
Tang, Chuan-Yi
Yen, Tzu-Chen
Hsieh, Wen-Ping
collection PubMed
description Copy number variation (CNV) has been reported to be associated with disease and various cancers. Hence, accurately identifying the position and type of each CNV is a critical issue. Many tools exist for detecting CNV regions, constructing haplotype phases on CNV regions, or estimating numerical copy numbers. However, none of them performs all three tasks at the same time. This paper presents a method based on a hidden Markov model to detect parent-specific copy number changes on both chromosomes from SNP array signals. A haplotype tree is constructed with dynamic branch merging to model the transitions in copy number status of the two alleles assessed at each SNP locus. The emission models are constructed for the genotypes formed by the two haplotypes. The proposed method provides the segmentation points of the CNV regions as well as the haplotype phasing of the allelic status on each chromosome. The estimated copy numbers are reported as fractional numbers, which can accommodate somatic mutations in cancer specimens that usually consist of heterogeneous cell populations. The algorithm is evaluated on simulated data and on previously published CNV regions of the 270 HapMap individuals. The results are compared with five popular methods: PennCNV, genoCN, COKGEN, QuantiSNP and cnvHap. An application to oral cancer samples demonstrates how the proposed method can facilitate clinical association studies. The proposed algorithm exhibits sensitivity for CNV regions comparable to that of the best algorithm in our genome-wide study and demonstrates the highest detection rate in SNP-dense regions. In addition, it provides better haplotype phasing accuracy than similar approaches. The clinical association analysis carried out with our fractional copy number estimates in the cancer samples provides better detection power than the analysis with integer copy number states.
format Online
Article
Text
id pubmed-4029584
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4029584 2014-05-28 PLoS One Research Article Public Library of Science 2014-05-21 /pmc/articles/PMC4029584/ /pubmed/24849202 http://dx.doi.org/10.1371/journal.pone.0096841 Text en © 2014 Lin et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title HaplotypeCN: Copy Number Haplotype Inference with Hidden Markov Model and Localized Haplotype Clustering
topic Research Article