SOAPindel: Efficient identification of indels from short paired reads

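We present a new approach to indel calling that explicitly exploits that indel differences between a reference and a sequenced sample make the mapping of reads less efficient. We assign all unmapped reads with a mapped partner to their expected genomic positions and then perform extensive de novo assembly on the regions with many unmapped reads to resolve homozygous, heterozygous, and complex indels by exhaustive traversal of the de Bruijn graph. The method is implemented in the software SOAPindel and provides a list of candidate indels with quality scores. We compare SOAPindel to Dindel, Pindel, and GATK on simulated data and find similar or better performance for short indels (<10 bp) and higher sensitivity and specificity for long indels. A validation experiment suggests that SOAPindel has a false-positive rate of ∼10% for long indels (>5 bp), while still providing many more candidate indels than other approaches.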

Bibliographic Details
Main authors: Li, Shengting, Li, Ruiqiang, Li, Heng, Lu, Jianliang, Li, Yingrui, Bolund, Lars, Schierup, Mikkel H., Wang, Jun
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2013
Subjects: Resource
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3530679/
https://www.ncbi.nlm.nih.gov/pubmed/22972939
http://dx.doi.org/10.1101/gr.132480.111
author Li, Shengting
Li, Ruiqiang
Li, Heng
Lu, Jianliang
Li, Yingrui
Bolund, Lars
Schierup, Mikkel H.
Wang, Jun
collection PubMed
description We present a new approach to indel calling that explicitly exploits that indel differences between a reference and a sequenced sample make the mapping of reads less efficient. We assign all unmapped reads with a mapped partner to their expected genomic positions and then perform extensive de novo assembly on the regions with many unmapped reads to resolve homozygous, heterozygous, and complex indels by exhaustive traversal of the de Bruijn graph. The method is implemented in the software SOAPindel and provides a list of candidate indels with quality scores. We compare SOAPindel to Dindel, Pindel, and GATK on simulated data and find similar or better performance for short indels (<10 bp) and higher sensitivity and specificity for long indels. A validation experiment suggests that SOAPindel has a false-positive rate of ∼10% for long indels (>5 bp), while still providing many more candidate indels than other approaches.
format Online
Article
Text
id pubmed-3530679
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-3530679 2013-07-01 SOAPindel: Efficient identification of indels from short paired reads. Li, Shengting; Li, Ruiqiang; Li, Heng; Lu, Jianliang; Li, Yingrui; Bolund, Lars; Schierup, Mikkel H.; Wang, Jun. Genome Res. Resource. Cold Spring Harbor Laboratory Press 2013-01. /pmc/articles/PMC3530679/ /pubmed/22972939 http://dx.doi.org/10.1101/gr.132480.111 Text en. © 2013, Published by Cold Spring Harbor Laboratory Press. http://creativecommons.org/licenses/by-nc/3.0/ This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported License), as described at http://creativecommons.org/licenses/by-nc/3.0/.
title SOAPindel: Efficient identification of indels from short paired reads
topic Resource