Detection and visualization of complex structural variants from long reads

BACKGROUND: With applications in cancer, drug metabolism, and disease etiology, understanding structural variation in the human genome is critical in advancing the thrusts of individualized medicine. However, structural variants (SVs) remain challenging to detect with high sensitivity using short re...

Bibliographic Details
Main authors: Stephens, Zachary; Wang, Chen; Iyer, Ravishankar K.; Kocher, Jean-Pierre
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6302372/
https://www.ncbi.nlm.nih.gov/pubmed/30577744
http://dx.doi.org/10.1186/s12859-018-2539-x
author Stephens, Zachary
Wang, Chen
Iyer, Ravishankar K.
Kocher, Jean-Pierre
collection PubMed
description BACKGROUND: With applications in cancer, drug metabolism, and disease etiology, understanding structural variation in the human genome is critical in advancing the thrusts of individualized medicine. However, structural variants (SVs) remain challenging to detect with high sensitivity using short read sequencing technologies. This problem is exacerbated when considering complex SVs comprised of multiple overlapping or nested rearrangements. Longer reads, such as those from Pacific Biosciences platforms, often span multiple breakpoints of such events, and thus provide a way to unravel small-scale complexities in SVs with higher confidence. RESULTS: We present CORGi (COmplex Rearrangement detection with Graph-search), a method for the detection and visualization of complex local genomic rearrangements. This method leverages the ability of long reads to span multiple breakpoints to untangle SVs that appear very complicated with respect to a reference genome. We validated our approach against both simulated long reads, and real data from two long read sequencing technologies. We demonstrate the ability of our method to identify breakpoints inserted in synthetic data with high accuracy, and the ability to detect and plot SVs from NA12878 germline, achieving 88.4% concordance between the two sets of sequence data. The patterns of complexity we find in many NA12878 SVs match known mechanisms associated with DNA replication and structural variant formation, and highlight the ability of our method to automatically label complex SVs with an intuitive combination of adjacent or overlapping reference transformations. CONCLUSIONS: CORGi is a method for interrogating genomic regions suspected to contain local rearrangements using long reads. Using pairwise alignments and graph search CORGi produces labels and visualizations for local SVs of arbitrary complexity.
format Online
Article
Text
id pubmed-6302372
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6302372 2018-12-31 Detection and visualization of complex structural variants from long reads Stephens, Zachary Wang, Chen Iyer, Ravishankar K. Kocher, Jean-Pierre BMC Bioinformatics Research BioMed Central 2018-12-21 /pmc/articles/PMC6302372/ /pubmed/30577744 http://dx.doi.org/10.1186/s12859-018-2539-x Text en
© The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Detection and visualization of complex structural variants from long reads
topic Research