Indel sensitive and comprehensive variant/mutation detection from RNA sequencing data for precision medicine

Bibliographic Details
Main Authors: Prodduturi, Naresh; Bhagwate, Aditya; Kocher, Jean-Pierre A.; Sun, Zhifu
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects: Research
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6157028/
https://www.ncbi.nlm.nih.gov/pubmed/30255803
http://dx.doi.org/10.1186/s12920-018-0391-5
author Prodduturi, Naresh
Bhagwate, Aditya
Kocher, Jean-Pierre A.
Sun, Zhifu
collection PubMed
description BACKGROUND: RNA-seq is the most commonly used sequencing application. Not only does it measure gene expression, but it is also an excellent medium for detecting important structural variants such as single nucleotide variants (SNVs), insertions/deletions (indels), or fusion transcripts. However, detecting these variants from RNA-seq is challenging and complex. Here we describe a sensitive and accurate analytical pipeline that detects various mutations at once for translational precision medicine.
METHODS: The pipeline incorporates the most sensitive aligners for indels in RNA-seq, best practices for data preprocessing and variant calling, and STAR-Fusion for chimeric transcripts. Variants/mutations are annotated, and key genes can be extracted for further investigation and clinical action. Three datasets were used to evaluate the performance of the pipeline for SNVs, indels, and fusion transcripts.
RESULTS: For the well-defined variants from NA12878 by the GIAB project, sensitivities of about 95% and 80% were obtained for SNVs and indels, respectively, in the matching RNA-seq data. Comparison with other variant-specific tools showed good performance of the pipeline. For the lung cancer dataset with 41 known oncogenic mutations, 39 were detected by the pipeline with the STAR aligner and all 41 with the GSNAP aligner. An actionable EML4-ALK fusion was also detected in one of the tumors, which also showed outlier ALK expression. For 9 fusions spiked into RNA-seq libraries at different concentrations, the pipeline detected all of them in unfiltered results, although some at very low concentrations may be missed when filtering is applied.
CONCLUSIONS: The new RNA-seq workflow is an accurate and comprehensive mutation profiler. Key or actionable mutations are reliably detected from RNA-seq, which makes it a practical alternative source for personalized medicine.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12920-018-0391-5) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6157028
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6157028 2018-09-27 BMC Med Genomics Research BioMed Central 2018-09-14 /pmc/articles/PMC6157028/ /pubmed/30255803 http://dx.doi.org/10.1186/s12920-018-0391-5 Text en
© The Author(s). 2018. Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Indel sensitive and comprehensive variant/mutation detection from RNA sequencing data for precision medicine
topic Research
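
The detection rates quoted in the abstract can be reproduced with simple arithmetic. The sketch below is illustrative only and is not the authors' pipeline code: the function name `sensitivity` is hypothetical, and only the lung cancer counts (39 and 41 of 41 known mutations) are taken from the abstract. The NA12878 figures are reported there only as approximate percentages, so no counts are assumed for them.

```python
def sensitivity(detected: int, known: int) -> float:
    """Return the fraction of known variants that were detected.

    Hypothetical helper for illustration; not part of the published pipeline.
    """
    if known <= 0:
        raise ValueError("known must be a positive count")
    if not 0 <= detected <= known:
        raise ValueError("detected must be between 0 and known")
    return detected / known


# Counts quoted in the abstract for the lung cancer dataset.
KNOWN_MUTATIONS = 41
detected_by_aligner = {"STAR": 39, "GSNAP": 41}

for aligner, detected in detected_by_aligner.items():
    rate = sensitivity(detected, KNOWN_MUTATIONS)
    print(f"{aligner}: {detected}/{KNOWN_MUTATIONS} detected ({rate:.1%})")
```

Under these counts, the STAR-based run corresponds to a sensitivity of roughly 95% on this dataset, and the GSNAP-based run to 100%.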