CAFU: a Galaxy framework for exploring unmapped RNA-Seq data

A widely used approach in transcriptome analysis is the alignment of short reads to a reference genome. However, owing to the deficiencies of specially designed analytical systems, short reads unmapped to the genome sequence are usually ignored, resulting in the loss of significant biological inform...

Bibliographic Details
Main Authors: Chen, Siyuan; Ren, Chengzhi; Zhai, Jingjing; Yu, Jiantao; Zhao, Xuyang; Li, Zelong; Zhang, Ting; Ma, Wenlong; Han, Zhaoxue; Ma, Chuang
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7299299/
https://www.ncbi.nlm.nih.gov/pubmed/30815667
http://dx.doi.org/10.1093/bib/bbz018
author Chen, Siyuan
Ren, Chengzhi
Zhai, Jingjing
Yu, Jiantao
Zhao, Xuyang
Li, Zelong
Zhang, Ting
Ma, Wenlong
Han, Zhaoxue
Ma, Chuang
collection PubMed
description A widely used approach in transcriptome analysis is the alignment of short reads to a reference genome. However, owing to the deficiencies of specially designed analytical systems, short reads unmapped to the genome sequence are usually ignored, resulting in the loss of significant biological information and insights. To fill this gap, we present Comprehensive Assembly and Functional annotation of Unmapped RNA-Seq data (CAFU), a Galaxy-based framework that can facilitate the large-scale analysis of unmapped RNA sequencing (RNA-Seq) reads from single- and mixed-species samples. By taking advantage of machine learning techniques, CAFU addresses the issue of accurately identifying the species origin of transcripts assembled using unmapped reads from mixed-species samples. CAFU also represents an innovation in that it provides a comprehensive collection of functions required for transcript confidence evaluation, coding potential calculation, sequence and expression characterization and function annotation. These functions and their dependencies have been integrated into a Galaxy framework that provides access to CAFU via a user-friendly interface, dramatically simplifying complex exploration tasks involving unmapped RNA-Seq reads. CAFU has been validated with RNA-Seq data sets from wheat and Zea mays (maize) samples. CAFU is freely available via GitHub: https://github.com/cma2015/CAFU.
format Online
Article
Text
id pubmed-7299299
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7299299 2020-06-22 Brief Bioinform Problem Solving Protocol Oxford University Press 2019-02-28 /pmc/articles/PMC7299299/ /pubmed/30815667 http://dx.doi.org/10.1093/bib/bbz018 Text en © The Author(s) 2019. Published by Oxford University Press.
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title CAFU: a Galaxy framework for exploring unmapped RNA-Seq data
topic Problem Solving Protocol