Direct comparison of performance of single nucleotide variant calling in human genome with alignment-based and assembly-based approaches
Complementary to reference-based variant detection, recent studies revealed that many novel variants could be detected with de novo assembled genomes. To evaluate the effect of read coverage and the accuracy of assembly-based variant calling, we simulated short reads containing more than 3 million...
Main Authors: | Wu, Leihong; Yavas, Gokhan; Hong, Huixiao; Tong, Weida; Xiao, Wenming |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2017 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5591230/ https://www.ncbi.nlm.nih.gov/pubmed/28887485 http://dx.doi.org/10.1038/s41598-017-10826-9 |
_version_ | 1783262669577388032 |
---|---|
author | Wu, Leihong; Yavas, Gokhan; Hong, Huixiao; Tong, Weida; Xiao, Wenming |
author_facet | Wu, Leihong; Yavas, Gokhan; Hong, Huixiao; Tong, Weida; Xiao, Wenming |
author_sort | Wu, Leihong |
collection | PubMed |
description | Complementary to reference-based variant detection, recent studies revealed that many novel variants could be detected with de novo assembled genomes. To evaluate the effect of read coverage and the accuracy of assembly-based variant calling, we simulated short reads containing more than 3 million single nucleotide variants (SNVs) from the whole human genome and compared the efficiency of SNV calling between the assembly-based and alignment-based calling approaches. We assessed the quality of the assembled contigs and found that a minimum of 30X coverage of short reads was needed to ensure reliable SNV calling and to generate assembled contigs with good coverage of the genome and genes. In addition, we observed that the assembly-based approach had much lower recall and precision than the alignment-based approach, which would recover 99% of imputed SNVs. We observed similar results with experimental reads for NA24385, an individual whose germline variants were well characterized. Although it adds value for SNV detection, the assembly-based approach carries a great risk of false discovery of novel SNVs. Further improvement of de novo assembly algorithms is needed to ensure good genome completeness, resolved haplotypes, and high fidelity of assembled sequences. |
format | Online Article Text |
id | pubmed-5591230 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-5591230 2017-09-13 Direct comparison of performance of single nucleotide variant calling in human genome with alignment-based and assembly-based approaches Wu, Leihong; Yavas, Gokhan; Hong, Huixiao; Tong, Weida; Xiao, Wenming Sci Rep Article Nature Publishing Group UK 2017-09-08 /pmc/articles/PMC5591230/ /pubmed/28887485 http://dx.doi.org/10.1038/s41598-017-10826-9 Text en © The Author(s) 2017 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
title | Direct comparison of performance of single nucleotide variant calling in human genome with alignment-based and assembly-based approaches |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5591230/ https://www.ncbi.nlm.nih.gov/pubmed/28887485 http://dx.doi.org/10.1038/s41598-017-10826-9 |
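
The description above compares the two calling approaches by recall and precision against a known set of SNVs. The following is a minimal sketch of how such metrics could be computed, not the authors' pipeline. The `Snv` tuple layout, the `evaluate_calls` function, and the toy variants are all hypothetical and chosen only for illustration.

```python
from typing import NamedTuple, Set, Tuple

# Hypothetical SNV key: (chromosome, 1-based position, reference allele, alternate allele).
Snv = Tuple[str, int, str, str]


class CallMetrics(NamedTuple):
    true_positives: int
    false_positives: int
    false_negatives: int
    precision: float
    recall: float


def evaluate_calls(truth: Set[Snv], called: Set[Snv]) -> CallMetrics:
    """Compare called SNVs with a truth set using exact matching on the SNV key."""
    tp = len(truth & called)   # called and present in the truth set
    fp = len(called - truth)   # called but absent from the truth set
    fn = len(truth - called)   # present in the truth set but missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return CallMetrics(tp, fp, fn, precision, recall)


# Toy data (made up): four simulated SNVs and three calls, one at a wrong position.
truth = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 75, "G", "A"),
    ("chr2", 900, "T", "C"),
}
called = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 80, "G", "A"),
}

metrics = evaluate_calls(truth, called)
print(f"precision={metrics.precision:.3f} recall={metrics.recall:.3f}")
```

Exact matching on position and alleles is the simplest possible comparison. A real evaluation would also need to handle variant normalization and genotype, which this sketch ignores.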