
GeDi: applying suffix arrays to increase the repertoire of detectable SNVs in tumour genomes

BACKGROUND: Current popular variant calling pipelines rely on the mapping coordinates of each input read to a reference genome in order to detect variants. Since reads deriving from variant loci that diverge in sequence substantially from the reference are often assigned incorrect mapping coordinate...


Bibliographic Details
Main Authors: Coleman, Izaak; Corleone, Giacomo; Arram, James; Ng, Ho-Cheung; Magnani, Luca; Luk, Wayne
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7003401/
https://www.ncbi.nlm.nih.gov/pubmed/32024475
http://dx.doi.org/10.1186/s12859-020-3367-3
_version_ 1783494527778029568
author Coleman, Izaak
Corleone, Giacomo
Arram, James
Ng, Ho-Cheung
Magnani, Luca
Luk, Wayne
author_sort Coleman, Izaak
collection PubMed
description BACKGROUND: Current popular variant calling pipelines rely on the mapping coordinates of each input read to a reference genome in order to detect variants. Since reads deriving from variant loci that diverge substantially in sequence from the reference are often assigned incorrect mapping coordinates, variant calling pipelines that rely on mapping coordinates can exhibit reduced sensitivity. RESULTS: In this work we present GeDi, a suffix array-based somatic single nucleotide variant (SNV) calling algorithm that does not rely on read mapping coordinates to detect SNVs and is therefore capable of reference-free and mapping-free SNV detection. GeDi executes with practical runtime and memory resource requirements, is capable of SNV detection at very low allele frequency (<1%), and detects SNVs with high sensitivity at complex variant loci, dramatically outperforming MuTect, a well-established pipeline. CONCLUSION: By designing novel suffix array-based SNV calling methods, we have developed practical SNV calling software, GeDi, that can characterise SNVs at complex variant loci and at low allele frequency, thus increasing the repertoire of detectable SNVs in tumour genomes. We expect GeDi to find use cases in targeted deep sequencing analysis, and to serve as a replacement for, and improvement over, previous suffix array-based SNV calling methods.
format Online
Article
Text
id pubmed-7003401
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7003401 2020-02-10 GeDi: applying suffix arrays to increase the repertoire of detectable SNVs in tumour genomes Coleman, Izaak Corleone, Giacomo Arram, James Ng, Ho-Cheung Magnani, Luca Luk, Wayne BMC Bioinformatics Software
BioMed Central 2020-02-05 /pmc/articles/PMC7003401/ /pubmed/32024475 http://dx.doi.org/10.1186/s12859-020-3367-3 Text en © The Author(s) 2020 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title GeDi: applying suffix arrays to increase the repertoire of detectable SNVs in tumour genomes
title_sort gedi: applying suffix arrays to increase the repertoire of detectable snvs in tumour genomes
topic Software