Exogene: A performant workflow for detecting viral integrations from paired-end next-generation sequencing data

Bibliographic Details
Main authors: Stephens, Zachary; O’Brien, Daniel; Dehankar, Mrunal; Roberts, Lewis R.; Iyer, Ravishankar K.; Kocher, Jean-Pierre
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8457494/
https://www.ncbi.nlm.nih.gov/pubmed/34550971
http://dx.doi.org/10.1371/journal.pone.0250915
_version_ 1784571109966872576
author Stephens, Zachary
O’Brien, Daniel
Dehankar, Mrunal
Roberts, Lewis R.
Iyer, Ravishankar K.
Kocher, Jean-Pierre
author_sort Stephens, Zachary
collection PubMed
description The integration of viruses into the human genome is known to be associated with tumorigenesis in many cancers, but the accurate detection of integration breakpoints from short read sequencing data is made difficult by human-viral homologies, viral genome heterogeneity, coverage limitations, and other factors. To address this, we present Exogene, a sensitive and efficient workflow for detecting viral integrations from paired-end next generation sequencing data. Exogene’s read filtering and breakpoint detection strategies yield integration coordinates that are highly concordant with long read validation. We demonstrate this concordance across 6 TCGA Hepatocellular carcinoma (HCC) tumor samples, identifying integrations of hepatitis B virus that are also supported by long reads. Additionally, we applied Exogene to targeted capture data from 426 previously studied HCC samples, achieving 98.9% concordance with existing methods and identifying 238 high-confidence integrations that were not previously reported. Exogene is applicable to multiple types of paired-end sequence data, including genome, exome, RNA-Seq and targeted capture.
format Online
Article
Text
id pubmed-8457494
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8457494 2021-09-23 Exogene: A performant workflow for detecting viral integrations from paired-end next-generation sequencing data Stephens, Zachary O’Brien, Daniel Dehankar, Mrunal Roberts, Lewis R. Iyer, Ravishankar K. Kocher, Jean-Pierre PLoS One Research Article The integration of viruses into the human genome is known to be associated with tumorigenesis in many cancers, but the accurate detection of integration breakpoints from short read sequencing data is made difficult by human-viral homologies, viral genome heterogeneity, coverage limitations, and other factors. To address this, we present Exogene, a sensitive and efficient workflow for detecting viral integrations from paired-end next generation sequencing data. Exogene’s read filtering and breakpoint detection strategies yield integration coordinates that are highly concordant with long read validation. We demonstrate this concordance across 6 TCGA Hepatocellular carcinoma (HCC) tumor samples, identifying integrations of hepatitis B virus that are also supported by long reads. Additionally, we applied Exogene to targeted capture data from 426 previously studied HCC samples, achieving 98.9% concordance with existing methods and identifying 238 high-confidence integrations that were not previously reported. Exogene is applicable to multiple types of paired-end sequence data, including genome, exome, RNA-Seq and targeted capture. Public Library of Science 2021-09-22 /pmc/articles/PMC8457494/ /pubmed/34550971 http://dx.doi.org/10.1371/journal.pone.0250915 Text en © 2021 Stephens et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Stephens, Zachary
O’Brien, Daniel
Dehankar, Mrunal
Roberts, Lewis R.
Iyer, Ravishankar K.
Kocher, Jean-Pierre
Exogene: A performant workflow for detecting viral integrations from paired-end next-generation sequencing data
title Exogene: A performant workflow for detecting viral integrations from paired-end next-generation sequencing data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8457494/
https://www.ncbi.nlm.nih.gov/pubmed/34550971
http://dx.doi.org/10.1371/journal.pone.0250915