Indel detection from DNA and RNA sequencing data with transIndel

BACKGROUND: Insertions and deletions (indels) are a major class of genomic variation associated with human disease. Indels are primarily detected from DNA sequencing (DNA-seq) data but their transcriptional consequences remain unexplored due to challenges in discriminating medium-sized and large ind...

Bibliographic Details
Main authors: Yang, Rendong; Van Etten, Jamie L.; Dehm, Scott M.
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5909256/
https://www.ncbi.nlm.nih.gov/pubmed/29673323
http://dx.doi.org/10.1186/s12864-018-4671-4
author Yang, Rendong
Van Etten, Jamie L.
Dehm, Scott M.
collection PubMed
description BACKGROUND: Insertions and deletions (indels) are a major class of genomic variation associated with human disease. Indels are primarily detected from DNA sequencing (DNA-seq) data, but their transcriptional consequences remain unexplored because medium-sized and large indels are difficult to discriminate from splicing events in RNA-seq data. RESULTS: Here, we developed transIndel, a splice-aware algorithm that parses the chimeric alignments predicted by a short read aligner and reconstructs mid-sized insertions and large deletions based on the linear alignments of split reads from DNA-seq or RNA-seq data. TransIndel exhibits competitive or superior performance over eight state-of-the-art indel detection tools on benchmarks using both synthetic and real DNA-seq data. Additionally, we applied transIndel to DNA-seq and RNA-seq datasets from 333 primary prostate cancer patients from The Cancer Genome Atlas (TCGA) and 59 metastatic prostate cancer patients from AACR-PCF Stand-Up-To-Cancer (SU2C) studies. TransIndel enhanced the taxonomy of DNA- and RNA-level alterations in prostate cancer by identifying recurrent FOXA1 indels as well as exitron splicing in genes implicated in disease progression. CONCLUSIONS: Our study demonstrates that transIndel is a robust tool for elucidation of medium-sized and large indels from DNA-seq and RNA-seq data. Including RNA-seq in indel discovery efforts leads to significant improvements in sensitivity for identification of medium-sized and large indels missed by DNA-seq, and reveals non-canonical RNA-splicing events in genes associated with disease pathology. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12864-018-4671-4) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5909256
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5909256 2018-04-30 Indel detection from DNA and RNA sequencing data with transIndel Yang, Rendong; Van Etten, Jamie L.; Dehm, Scott M. BMC Genomics Software
BioMed Central 2018-04-19 /pmc/articles/PMC5909256/ /pubmed/29673323 http://dx.doi.org/10.1186/s12864-018-4671-4 Text en © The Author(s). 2018 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Indel detection from DNA and RNA sequencing data with transIndel
topic Software