LongAGE: defining breakpoints of genomic structural variants through optimal and memory efficient alignments of long reads

SUMMARY: Defining the precise location of structural variations (SVs) at single-nucleotide breakpoint resolution is a challenging problem due to large gaps in alignment. Previously, Alignment with Gap Excision (AGE) enabled us to define breakpoints of SVs at single-nucleotide resolution; however, AG...

Bibliographic Details
Main Authors: Tran, Quang; Abyzov, Alexej
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8128450/
https://www.ncbi.nlm.nih.gov/pubmed/32777815
http://dx.doi.org/10.1093/bioinformatics/btaa703
_version_ 1783694114429075456
author Tran, Quang
Abyzov, Alexej
author_facet Tran, Quang
Abyzov, Alexej
author_sort Tran, Quang
collection PubMed
description SUMMARY: Defining the precise location of structural variations (SVs) at single-nucleotide breakpoint resolution is a challenging problem due to large gaps in alignment. Previously, Alignment with Gap Excision (AGE) enabled us to define breakpoints of SVs at single-nucleotide resolution; however, AGE requires a vast amount of memory when aligning a pair of long sequences. To address this, we developed a memory-efficient implementation—LongAGE—based on the classical Hirschberg algorithm. We demonstrate an application of LongAGE for resolving breakpoints of SVs embedded into segmental duplications on Pacific Biosciences (PacBio) reads that can be longer than 10 kb. Furthermore, we observed different breakpoints for a deletion and a duplication in the same locus, providing direct evidence that such multi-allelic copy number variants (mCNVs) arise from two or more independent ancestral mutations. AVAILABILITY AND IMPLEMENTATION: LongAGE is implemented in C++ and available on Github at https://github.com/Coaxecva/LongAGE. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
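The description above says LongAGE is based on the classical Hirschberg algorithm. As a rough illustration of that idea only, the following Python sketch implements textbook Hirschberg divide-and-conquer for global alignment with a linear gap penalty. It is not the LongAGE code and does not model AGE's gap excision; the function names (`nw_last_row`, `hirschberg`, `alignment_score`) and the scoring parameters are assumptions chosen for the example.

```python
def nw_last_row(a, b, match=1, mismatch=-1, gap=-1):
    """Last row of the global-alignment score matrix, keeping only two rows."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur[j] = max(diag, prev[j] + gap, cur[j - 1] + gap)
        prev = cur
    return prev


def hirschberg(a, b, match=1, mismatch=-1, gap=-1):
    """Return one optimal global alignment of a and b as two gapped strings."""
    if not a:
        return "-" * len(b), b
    if not b:
        return a, "-" * len(a)
    if len(a) == 1:
        # Base case: pair the single character with its best partner in b,
        # unless leaving it unpaired (two extra gaps) scores higher.
        best_j = max(range(len(b)), key=lambda j: match if b[j] == a else mismatch)
        sub = match if b[best_j] == a else mismatch
        if sub >= 2 * gap:
            return "-" * best_j + a + "-" * (len(b) - best_j - 1), b
        return a + "-" * len(b), "-" + b

    mid = len(a) // 2
    # Scores of aligning the top half of a with every prefix of b ...
    left = nw_last_row(a[:mid], b, match, mismatch, gap)
    # ... and the bottom half of a with every suffix of b (via reversal).
    right = nw_last_row(a[mid:][::-1], b[::-1], match, mismatch, gap)
    split = max(range(len(b) + 1), key=lambda j: left[j] + right[len(b) - j])

    top = hirschberg(a[:mid], b[:split], match, mismatch, gap)
    bottom = hirschberg(a[mid:], b[split:], match, mismatch, gap)
    return top[0] + bottom[0], top[1] + bottom[1]


def alignment_score(x, y, match=1, mismatch=-1, gap=-1):
    """Score a gapped alignment column by column (helper for checking)."""
    total = 0
    for cx, cy in zip(x, y):
        if cx == "-" or cy == "-":
            total += gap
        else:
            total += match if cx == cy else mismatch
    return total


if __name__ == "__main__":
    x, y = hirschberg("AGTACGCA", "TATGC")
    print(x)
    print(y)
```

Only two score rows are held at any time, so working memory grows with the length of the shorter dimension rather than with the product of the two sequence lengths, at the cost of roughly doubling the computation.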
format Online
Article
Text
id pubmed-8128450
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8128450 2021-05-21 LongAGE: defining breakpoints of genomic structural variants through optimal and memory efficient alignments of long reads. Tran, Quang; Abyzov, Alexej. Bioinformatics, Applications Notes. Oxford University Press 2020-08-10 /pmc/articles/PMC8128450/ /pubmed/32777815 http://dx.doi.org/10.1093/bioinformatics/btaa703 Text en © The Author(s) 2020. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title LongAGE: defining breakpoints of genomic structural variants through optimal and memory efficient alignments of long reads
topic Applications Notes
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8128450/
https://www.ncbi.nlm.nih.gov/pubmed/32777815
http://dx.doi.org/10.1093/bioinformatics/btaa703