INSurVeyor: improving insertion calling from short read sequencing data
Insertions are one of the major types of structural variations and are defined as the addition of 50 nucleotides or more into a DNA sequence. Several methods exist to detect insertions from next-generation sequencing short read data, but they generally have low sensitivity. Our contribution is two-f...
Main Authors: | Rajaby, Ramesh; Liu, Dong-Xu; Au, Chun Hang; Cheung, Yuen-Ting; Lau, Amy Yuet Ting; Yang, Qing-Yong; Sung, Wing-Kin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10241795/ https://www.ncbi.nlm.nih.gov/pubmed/37277343 http://dx.doi.org/10.1038/s41467-023-38870-2 |
_version_ | 1785054067972636672 |
---|---|
author | Rajaby, Ramesh Liu, Dong-Xu Au, Chun Hang Cheung, Yuen-Ting Lau, Amy Yuet Ting Yang, Qing-Yong Sung, Wing-Kin |
author_facet | Rajaby, Ramesh Liu, Dong-Xu Au, Chun Hang Cheung, Yuen-Ting Lau, Amy Yuet Ting Yang, Qing-Yong Sung, Wing-Kin |
author_sort | Rajaby, Ramesh |
collection | PubMed |
description | Insertions are one of the major types of structural variations and are defined as the addition of 50 nucleotides or more into a DNA sequence. Several methods exist to detect insertions from next-generation sequencing short read data, but they generally have low sensitivity. Our contribution is two-fold. First, we introduce INSurVeyor, a fast, sensitive and precise method that detects insertions from next-generation sequencing paired-end data. Using publicly available benchmark datasets (both human and non-human), we show that INSurVeyor is not only more sensitive than any individual caller we tested, but also more sensitive than all of them combined. Furthermore, for most types of insertions, INSurVeyor is almost as sensitive as long-read callers. Second, we provide state-of-the-art catalogues of insertions for 1047 Arabidopsis thaliana genomes from the 1001 Genomes Project and 3202 human genomes from the 1000 Genomes Project, both generated with INSurVeyor. We show that they are more complete and precise than existing resources, and that existing methods miss important insertions. |
format | Online Article Text |
id | pubmed-10241795 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-10241795 2023-06-07 INSurVeyor: improving insertion calling from short read sequencing data Rajaby, Ramesh Liu, Dong-Xu Au, Chun Hang Cheung, Yuen-Ting Lau, Amy Yuet Ting Yang, Qing-Yong Sung, Wing-Kin Nat Commun Article Insertions are one of the major types of structural variations and are defined as the addition of 50 nucleotides or more into a DNA sequence. Several methods exist to detect insertions from next-generation sequencing short read data, but they generally have low sensitivity. Our contribution is two-fold. First, we introduce INSurVeyor, a fast, sensitive and precise method that detects insertions from next-generation sequencing paired-end data. Using publicly available benchmark datasets (both human and non-human), we show that INSurVeyor is not only more sensitive than any individual caller we tested, but also more sensitive than all of them combined. Furthermore, for most types of insertions, INSurVeyor is almost as sensitive as long-read callers. Second, we provide state-of-the-art catalogues of insertions for 1047 Arabidopsis thaliana genomes from the 1001 Genomes Project and 3202 human genomes from the 1000 Genomes Project, both generated with INSurVeyor. We show that they are more complete and precise than existing resources, and that existing methods miss important insertions. Nature Publishing Group UK 2023-06-05 /pmc/articles/PMC10241795/ /pubmed/37277343 http://dx.doi.org/10.1038/s41467-023-38870-2 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Rajaby, Ramesh Liu, Dong-Xu Au, Chun Hang Cheung, Yuen-Ting Lau, Amy Yuet Ting Yang, Qing-Yong Sung, Wing-Kin INSurVeyor: improving insertion calling from short read sequencing data |
title | INSurVeyor: improving insertion calling from short read sequencing data |
title_full | INSurVeyor: improving insertion calling from short read sequencing data |
title_fullStr | INSurVeyor: improving insertion calling from short read sequencing data |
title_full_unstemmed | INSurVeyor: improving insertion calling from short read sequencing data |
title_short | INSurVeyor: improving insertion calling from short read sequencing data |
title_sort | insurveyor: improving insertion calling from short read sequencing data |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10241795/ https://www.ncbi.nlm.nih.gov/pubmed/37277343 http://dx.doi.org/10.1038/s41467-023-38870-2 |