Performance evaluation of indel calling tools using real short-read data

BACKGROUND: Insertion and deletion (indel), a common form of genetic variation, has been shown to cause or contribute to human genetic diseases and cancer. With the advance of next-generation sequencing technology, many indel calling tools have been developed; however, evaluation and comparison of t...

Bibliographic Details
Main Authors: Hasan, Mohammad Shabbir, Wu, Xiaowei, Zhang, Liqing
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4545535/
https://www.ncbi.nlm.nih.gov/pubmed/26286629
http://dx.doi.org/10.1186/s40246-015-0042-2
_version_ 1782386755862790144
author Hasan, Mohammad Shabbir
Wu, Xiaowei
Zhang, Liqing
author_facet Hasan, Mohammad Shabbir
Wu, Xiaowei
Zhang, Liqing
author_sort Hasan, Mohammad Shabbir
collection PubMed
description BACKGROUND: Insertion and deletion (indel), a common form of genetic variation, has been shown to cause or contribute to human genetic diseases and cancer. With the advance of next-generation sequencing technology, many indel calling tools have been developed; however, evaluation and comparison of these tools using large-scale real data are still scant. Here we evaluated seven popular and publicly available indel calling tools, GATK Unified Genotyper, VarScan, Pindel, SAMtools, Dindel, GATK HaplotypeCaller, and Platypus, using 78 human genome low-coverage data from the 1000 Genomes project. RESULTS: Comparing indels called by these tools with a known set of indels, we found that Platypus outperforms other tools. In addition, a high percentage of known indels still remain undetected and the number of common indels called by all seven tools is very low. CONCLUSION: All these findings indicate the necessity of improving the existing tools or developing new algorithms to achieve reliable and consistent indel calling results. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s40246-015-0042-2) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4545535
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4545535 2015-08-23 Performance evaluation of indel calling tools using real short-read data Hasan, Mohammad Shabbir Wu, Xiaowei Zhang, Liqing Hum Genomics Review BACKGROUND: Insertion and deletion (indel), a common form of genetic variation, has been shown to cause or contribute to human genetic diseases and cancer. With the advance of next-generation sequencing technology, many indel calling tools have been developed; however, evaluation and comparison of these tools using large-scale real data are still scant. Here we evaluated seven popular and publicly available indel calling tools, GATK Unified Genotyper, VarScan, Pindel, SAMtools, Dindel, GATK HaplotypeCaller, and Platypus, using 78 human genome low-coverage data from the 1000 Genomes project. RESULTS: Comparing indels called by these tools with a known set of indels, we found that Platypus outperforms other tools. In addition, a high percentage of known indels still remain undetected and the number of common indels called by all seven tools is very low. CONCLUSION: All these findings indicate the necessity of improving the existing tools or developing new algorithms to achieve reliable and consistent indel calling results. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s40246-015-0042-2) contains supplementary material, which is available to authorized users. BioMed Central 2015-08-19 /pmc/articles/PMC4545535/ /pubmed/26286629 http://dx.doi.org/10.1186/s40246-015-0042-2 Text en © Hasan et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Review
Hasan, Mohammad Shabbir
Wu, Xiaowei
Zhang, Liqing
Performance evaluation of indel calling tools using real short-read data
title Performance evaluation of indel calling tools using real short-read data
title_full Performance evaluation of indel calling tools using real short-read data
title_fullStr Performance evaluation of indel calling tools using real short-read data
title_full_unstemmed Performance evaluation of indel calling tools using real short-read data
title_short Performance evaluation of indel calling tools using real short-read data
title_sort performance evaluation of indel calling tools using real short-read data
topic Review
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4545535/
https://www.ncbi.nlm.nih.gov/pubmed/26286629
http://dx.doi.org/10.1186/s40246-015-0042-2
work_keys_str_mv AT hasanmohammadshabbir performanceevaluationofindelcallingtoolsusingrealshortreaddata
AT wuxiaowei performanceevaluationofindelcallingtoolsusingrealshortreaddata
AT zhangliqing performanceevaluationofindelcallingtoolsusingrealshortreaddata