Coval: Improving Alignment Quality and Variant Calling Accuracy for Next-Generation Sequencing Data
Accurate identification of DNA polymorphisms using next-generation sequencing technology is challenging because of a high rate of sequencing error and incorrect mapping of reads to reference genomes. Currently available short read aligners and DNA variant callers suffer from these problems. We devel...
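Main authors: | Kosugi, Shunichi; Natsume, Satoshi; Yoshida, Kentaro; MacLean, Daniel; Cano, Liliana; Kamoun, Sophien; Terauchi, Ryohei |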
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2013 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3792961/ https://www.ncbi.nlm.nih.gov/pubmed/24116042 http://dx.doi.org/10.1371/journal.pone.0075402 |
_version_ | 1782286909057269760 |
---|---|
author | Kosugi, Shunichi Natsume, Satoshi Yoshida, Kentaro MacLean, Daniel Cano, Liliana Kamoun, Sophien Terauchi, Ryohei |
author_facet | Kosugi, Shunichi Natsume, Satoshi Yoshida, Kentaro MacLean, Daniel Cano, Liliana Kamoun, Sophien Terauchi, Ryohei |
author_sort | Kosugi, Shunichi |
collection | PubMed |
description | Accurate identification of DNA polymorphisms using next-generation sequencing technology is challenging because of a high rate of sequencing error and incorrect mapping of reads to reference genomes. Currently available short read aligners and DNA variant callers suffer from these problems. We developed the Coval software to improve the quality of short read alignments. Coval is designed to minimize the incidence of spurious alignment of short reads, by filtering mismatched reads that remained in alignments after local realignment and error correction of mismatched reads. The error correction is executed based on the base quality and allele frequency at the non-reference positions for an individual or pooled sample. We demonstrated the utility of Coval by applying it to simulated genomes and experimentally obtained short-read data of rice, nematode, and mouse. Moreover, we found an unexpectedly large number of incorrectly mapped reads in ‘targeted’ alignments, where the whole genome sequencing reads had been aligned to a local genomic segment, and showed that Coval effectively eliminated such spurious alignments. We conclude that Coval significantly improves the quality of short-read sequence alignments, thereby increasing the calling accuracy of currently available tools for SNP and indel identification. Coval is available at http://sourceforge.net/projects/coval105/. |
format | Online Article Text |
id | pubmed-3792961 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3792961 2013-10-10 Coval: Improving Alignment Quality and Variant Calling Accuracy for Next-Generation Sequencing Data Kosugi, Shunichi Natsume, Satoshi Yoshida, Kentaro MacLean, Daniel Cano, Liliana Kamoun, Sophien Terauchi, Ryohei PLoS One Research Article Accurate identification of DNA polymorphisms using next-generation sequencing technology is challenging because of a high rate of sequencing error and incorrect mapping of reads to reference genomes. Currently available short read aligners and DNA variant callers suffer from these problems. We developed the Coval software to improve the quality of short read alignments. Coval is designed to minimize the incidence of spurious alignment of short reads, by filtering mismatched reads that remained in alignments after local realignment and error correction of mismatched reads. The error correction is executed based on the base quality and allele frequency at the non-reference positions for an individual or pooled sample. We demonstrated the utility of Coval by applying it to simulated genomes and experimentally obtained short-read data of rice, nematode, and mouse. Moreover, we found an unexpectedly large number of incorrectly mapped reads in ‘targeted’ alignments, where the whole genome sequencing reads had been aligned to a local genomic segment, and showed that Coval effectively eliminated such spurious alignments. We conclude that Coval significantly improves the quality of short-read sequence alignments, thereby increasing the calling accuracy of currently available tools for SNP and indel identification. Coval is available at http://sourceforge.net/projects/coval105/. Public Library of Science 2013-10-08 /pmc/articles/PMC3792961/ /pubmed/24116042 http://dx.doi.org/10.1371/journal.pone.0075402 Text en © 2013 Kosugi et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Kosugi, Shunichi Natsume, Satoshi Yoshida, Kentaro MacLean, Daniel Cano, Liliana Kamoun, Sophien Terauchi, Ryohei Coval: Improving Alignment Quality and Variant Calling Accuracy for Next-Generation Sequencing Data |
title | Coval: Improving Alignment Quality and Variant Calling Accuracy for Next-Generation Sequencing Data |
title_full | Coval: Improving Alignment Quality and Variant Calling Accuracy for Next-Generation Sequencing Data |
title_fullStr | Coval: Improving Alignment Quality and Variant Calling Accuracy for Next-Generation Sequencing Data |
title_full_unstemmed | Coval: Improving Alignment Quality and Variant Calling Accuracy for Next-Generation Sequencing Data |
title_short | Coval: Improving Alignment Quality and Variant Calling Accuracy for Next-Generation Sequencing Data |
title_sort | coval: improving alignment quality and variant calling accuracy for next-generation sequencing data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3792961/ https://www.ncbi.nlm.nih.gov/pubmed/24116042 http://dx.doi.org/10.1371/journal.pone.0075402 |
work_keys_str_mv | AT kosugishunichi covalimprovingalignmentqualityandvariantcallingaccuracyfornextgenerationsequencingdata AT natsumesatoshi covalimprovingalignmentqualityandvariantcallingaccuracyfornextgenerationsequencingdata AT yoshidakentaro covalimprovingalignmentqualityandvariantcallingaccuracyfornextgenerationsequencingdata AT macleandaniel covalimprovingalignmentqualityandvariantcallingaccuracyfornextgenerationsequencingdata AT canoliliana covalimprovingalignmentqualityandvariantcallingaccuracyfornextgenerationsequencingdata AT kamounsophien covalimprovingalignmentqualityandvariantcallingaccuracyfornextgenerationsequencingdata AT terauchiryohei covalimprovingalignmentqualityandvariantcallingaccuracyfornextgenerationsequencingdata |
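
The description field above outlines a two-step idea: correct mismatched bases using base quality and allele frequency, then filter reads that still carry mismatches. The following is a minimal illustrative sketch of that idea only, not Coval's actual implementation. It assumes ungapped alignments, and the `Read` class, the function name `correct_and_filter`, and the thresholds `min_qual`, `min_allele_freq`, and `max_mismatches` are all hypothetical names and values chosen for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Read:
    start: int        # 0-based reference position of the first base (hypothetical model)
    seq: str          # read bases; an ungapped alignment is assumed
    quals: list[int]  # Phred base qualities, one per base


def correct_and_filter(reference, reads, min_qual=20,
                       min_allele_freq=0.2, max_mismatches=2):
    """Toy sketch: error-correct mismatches, then drop reads with too many left.

    All thresholds are illustrative assumptions, not values from the paper.
    """
    # Tally coverage depth and per-base counts at every covered position.
    depth = Counter()
    allele = Counter()
    for r in reads:
        for i, base in enumerate(r.seq):
            pos = r.start + i
            depth[pos] += 1
            allele[(pos, base)] += 1

    kept = []
    for r in reads:
        corrected = []
        mismatches = 0
        for i, (base, qual) in enumerate(zip(r.seq, r.quals)):
            pos = r.start + i
            ref_base = reference[pos]
            if base != ref_base:
                freq = allele[(pos, base)] / depth[pos]
                if qual < min_qual or freq < min_allele_freq:
                    # Weakly supported mismatch: treat as a sequencing error
                    # and revert it to the reference base.
                    base = ref_base
                else:
                    # Well supported mismatch: keep it and count it.
                    mismatches += 1
            corrected.append(base)
        # Filter step: keep only reads whose remaining mismatches are few.
        if mismatches <= max_mismatches:
            kept.append(Read(r.start, "".join(corrected), r.quals))
    return kept
```

A real pipeline would work on SAM/BAM records, handle indels and soft clipping, and perform local realignment first; none of that is modeled here.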