vi-HMM: a novel HMM-based method for sequence variant identification in short-read data
BACKGROUND: Accurate and reliable identification of sequence variants, including single nucleotide polymorphisms (SNPs) and insertion-deletion polymorphisms (INDELs), plays a fundamental role in next-generation sequencing (NGS) applications. Existing methods for calling these variants often make sim...
Main authors: | Tang, Man; Hasan, Mohammad Shabbir; Zhu, Hongxiao; Zhang, Liqing; Wu, Xiaowei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central, 2019 |
Subjects: | Primary Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6387560/ https://www.ncbi.nlm.nih.gov/pubmed/30795817 http://dx.doi.org/10.1186/s40246-019-0194-6 |
_version_ | 1783397610104553472 |
---|---|
author | Tang, Man; Hasan, Mohammad Shabbir; Zhu, Hongxiao; Zhang, Liqing; Wu, Xiaowei |
author_facet | Tang, Man; Hasan, Mohammad Shabbir; Zhu, Hongxiao; Zhang, Liqing; Wu, Xiaowei |
author_sort | Tang, Man |
collection | PubMed |
description | BACKGROUND: Accurate and reliable identification of sequence variants, including single nucleotide polymorphisms (SNPs) and insertion-deletion polymorphisms (INDELs), plays a fundamental role in next-generation sequencing (NGS) applications. Existing methods for calling these variants often make simplifying assumptions of positional independence and fail to leverage the dependence between genotypes at nearby loci that is caused by linkage disequilibrium (LD). RESULTS AND CONCLUSION: We propose vi-HMM, a hidden Markov model (HMM)-based method for calling SNPs and INDELs in mapped short-read data. This method allows transitions between hidden states (defined as “SNP,” “Ins,” “Del,” and “Match”) of adjacent genomic bases and determines an optimal hidden state path by using the Viterbi algorithm. The inferred hidden state path provides a direct solution to the identification of SNPs and INDELs. Simulation studies show that, under various sequencing depths, vi-HMM outperforms commonly used variant calling methods in terms of sensitivity and F1 score. When applied to real data, vi-HMM demonstrates higher accuracy in calling SNPs and INDELs. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s40246-019-0194-6) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-6387560 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6387560 2019-03-04 vi-HMM: a novel HMM-based method for sequence variant identification in short-read data Tang, Man; Hasan, Mohammad Shabbir; Zhu, Hongxiao; Zhang, Liqing; Wu, Xiaowei Hum Genomics Primary Research BACKGROUND: Accurate and reliable identification of sequence variants, including single nucleotide polymorphisms (SNPs) and insertion-deletion polymorphisms (INDELs), plays a fundamental role in next-generation sequencing (NGS) applications. Existing methods for calling these variants often make simplified assumptions of positional independence and fail to leverage the dependence between genotypes at nearby loci that is caused by linkage disequilibrium (LD). RESULTS AND CONCLUSION: We propose vi-HMM, a hidden Markov model (HMM)-based method for calling SNPs and INDELs in mapped short-read data. This method allows transitions between hidden states (defined as “SNP,” “Ins,” “Del,” and “Match”) of adjacent genomic bases and determines an optimal hidden state path by using the Viterbi algorithm. The inferred hidden state path provides a direct solution to the identification of SNPs and INDELs. Simulation studies show that, under various sequencing depths, vi-HMM outperforms commonly used variant calling methods in terms of sensitivity and F1 score. When applied to the real data, vi-HMM demonstrates higher accuracy in calling SNPs and INDELs. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s40246-019-0194-6) contains supplementary material, which is available to authorized users.
BioMed Central 2019-02-13 /pmc/articles/PMC6387560/ /pubmed/30795817 http://dx.doi.org/10.1186/s40246-019-0194-6 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Primary Research; Tang, Man; Hasan, Mohammad Shabbir; Zhu, Hongxiao; Zhang, Liqing; Wu, Xiaowei; vi-HMM: a novel HMM-based method for sequence variant identification in short-read data |
title | vi-HMM: a novel HMM-based method for sequence variant identification in short-read data |
topic | Primary Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6387560/ https://www.ncbi.nlm.nih.gov/pubmed/30795817 http://dx.doi.org/10.1186/s40246-019-0194-6 |
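
The description above says that vi-HMM assigns one of four hidden states ("Match", "SNP", "Ins", "Del") to each genomic base and recovers the best state path with the Viterbi algorithm. The following is a minimal sketch of that decoding step only, not the authors' implementation. The observation alphabet (`ref`, `mismatch`, `extra`, `gap`, standing for a per-position summary of the aligned reads) and every probability in `START`, `TRANS`, and `EMIT` are hypothetical values chosen for illustration; the record does not give the model's actual emissions or parameters.

```python
import math

# Hidden states named in the abstract.
STATES = ["Match", "SNP", "Ins", "Del"]

# All probabilities below are hypothetical, for illustration only.
START = {"Match": 0.97, "SNP": 0.01, "Ins": 0.01, "Del": 0.01}
TRANS = {
    "Match": {"Match": 0.97, "SNP": 0.01, "Ins": 0.01, "Del": 0.01},
    "SNP":   {"Match": 0.90, "SNP": 0.08, "Ins": 0.01, "Del": 0.01},
    "Ins":   {"Match": 0.60, "SNP": 0.01, "Ins": 0.38, "Del": 0.01},
    "Del":   {"Match": 0.60, "SNP": 0.01, "Ins": 0.01, "Del": 0.38},
}
# Assumed observation symbols: a coarse summary of the reads at each position.
EMIT = {
    "Match": {"ref": 0.997, "mismatch": 0.001, "extra": 0.001, "gap": 0.001},
    "SNP":   {"ref": 0.05, "mismatch": 0.93, "extra": 0.01, "gap": 0.01},
    "Ins":   {"ref": 0.05, "mismatch": 0.01, "extra": 0.93, "gap": 0.01},
    "Del":   {"ref": 0.05, "mismatch": 0.01, "extra": 0.01, "gap": 0.93},
}


def viterbi(observations):
    """Return the most probable hidden-state path for the observations."""
    if not observations:
        return []

    # Log-probabilities avoid underflow on long sequences.
    first = observations[0]
    scores = [{s: math.log(START[s]) + math.log(EMIT[s][first]) for s in STATES}]
    backpointers = []

    for obs in observations[1:]:
        prev = scores[-1]
        current, pointer = {}, {}
        for s in STATES:
            # Best predecessor state for s at this position.
            best = max(STATES, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            current[s] = (prev[best] + math.log(TRANS[best][s])
                          + math.log(EMIT[s][obs]))
            pointer[s] = best
        scores.append(current)
        backpointers.append(pointer)

    # Trace back from the best final state.
    path = [max(STATES, key=lambda s: scores[-1][s])]
    for pointer in reversed(backpointers):
        path.append(pointer[path[-1]])
    path.reverse()
    return path


calls = viterbi(["ref", "ref", "mismatch", "ref", "gap", "gap", "ref"])
print(calls)
# ['Match', 'Match', 'SNP', 'Match', 'Del', 'Del', 'Match']
```

In this toy setup, the two adjacent `gap` observations are decoded as a single two-base deletion because the assumed `Del` to `Del` transition is relatively likely. This kind of dependence between neighbouring positions is what a per-position caller would ignore.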