SOAPindel: Efficient identification of indels from short paired reads
We present a new approach to indel calling that explicitly exploits that indel differences between a reference and a sequenced sample make the mapping of reads less efficient. We assign all unmapped reads with a mapped partner to their expected genomic positions and then perform extensive de novo as...
Main authors: | Li, Shengting; Li, Ruiqiang; Li, Heng; Lu, Jianliang; Li, Yingrui; Bolund, Lars; Schierup, Mikkel H.; Wang, Jun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory Press, 2013 |
Subjects: | Resource |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3530679/ https://www.ncbi.nlm.nih.gov/pubmed/22972939 http://dx.doi.org/10.1101/gr.132480.111 |
_version_ | 1782254046259707904 |
---|---|
author | Li, Shengting; Li, Ruiqiang; Li, Heng; Lu, Jianliang; Li, Yingrui; Bolund, Lars; Schierup, Mikkel H.; Wang, Jun |
author_facet | Li, Shengting; Li, Ruiqiang; Li, Heng; Lu, Jianliang; Li, Yingrui; Bolund, Lars; Schierup, Mikkel H.; Wang, Jun |
author_sort | Li, Shengting |
collection | PubMed |
description | We present a new approach to indel calling that explicitly exploits that indel differences between a reference and a sequenced sample make the mapping of reads less efficient. We assign all unmapped reads with a mapped partner to their expected genomic positions and then perform extensive de novo assembly on the regions with many unmapped reads to resolve homozygous, heterozygous, and complex indels by exhaustive traversal of the de Bruijn graph. The method is implemented in the software SOAPindel and provides a list of candidate indels with quality scores. We compare SOAPindel to Dindel, Pindel, and GATK on simulated data and find similar or better performance for short indels (<10 bp) and higher sensitivity and specificity for long indels. A validation experiment suggests that SOAPindel has a false-positive rate of ∼10% for long indels (>5 bp), while still providing many more candidate indels than other approaches. |
format | Online Article Text |
id | pubmed-3530679 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Cold Spring Harbor Laboratory Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-3530679 2013-07-01 SOAPindel: Efficient identification of indels from short paired reads. Genome Res. Resource. Cold Spring Harbor Laboratory Press 2013-01 /pmc/articles/PMC3530679/ /pubmed/22972939 http://dx.doi.org/10.1101/gr.132480.111 Text en © 2013, Published by Cold Spring Harbor Laboratory Press http://creativecommons.org/licenses/by-nc/3.0/ This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported License), as described at http://creativecommons.org/licenses/by-nc/3.0/. |
title | SOAPindel: Efficient identification of indels from short paired reads |
topic | Resource |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3530679/ https://www.ncbi.nlm.nih.gov/pubmed/22972939 http://dx.doi.org/10.1101/gr.132480.111 |
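
The description above mentions resolving indels by exhaustive traversal of a de Bruijn graph built from locally assembled reads. The following is a minimal Python sketch of that general idea, not SOAPindel's actual implementation: the function names (`build_de_bruijn`, `enumerate_paths`, `call_indels`), the `max_extra` bound, and the toy sequences are all assumptions made for illustration. It builds a graph from the reads of one region, lists every path between the k-mers that anchor the region on the reference, and reports paths whose length differs from the reference as candidate indels.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def enumerate_paths(graph, start, end, max_len):
    """Exhaustively list the sequences spelled by paths from start to end.

    max_len bounds the spelled sequence so that cycles cannot loop forever.
    """
    results = []
    stack = [(start, start)]
    while stack:
        node, seq = stack.pop()
        if node == end and len(seq) > len(start):
            results.append(seq)
            continue
        if len(seq) >= max_len:
            continue
        for nxt in sorted(graph.get(node, ())):
            stack.append((nxt, seq + nxt[-1]))
    return results


def call_indels(reference, reads, k, max_extra=20):
    """Report assembled paths whose length differs from the reference window.

    The first and last (k-1)-mers of the reference window act as anchors.
    max_extra (hypothetical parameter) caps how much longer than the
    reference an assembled path may be.
    """
    start, end = reference[:k - 1], reference[-(k - 1):]
    graph = build_de_bruijn(reads, k)
    calls = []
    for seq in enumerate_paths(graph, start, end, len(reference) + max_extra):
        diff = len(seq) - len(reference)
        if diff > 0:
            calls.append(("insertion", diff, seq))
        elif diff < 0:
            calls.append(("deletion", -diff, seq))
    return calls


# Toy example: the sample lacks the 3 bases "CAT" present in the reference.
reference = "ACGTTGCATGGACCTA"
reads = ["ACGTTGGG", "TTGGGACC", "GGGACCTA"]
calls = call_indels(reference, reads, k=4)
```

In this toy case every read comes from a single haplotype, so the graph has exactly one anchor-to-anchor path. With heterozygous data the graph branches, and a purely exhaustive traversal can also return chimeric paths that mix alleles, which is why a practical caller would additionally need something like read-coverage weighting and quality scoring.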