PARRoT- a homology-based strategy to quantify and compare RNA-sequencing from non-model organisms

BACKGROUND: Next-generation sequencing promises the de novo genomic and transcriptomic analysis of samples of interest. However, only a few organisms have reference genomic sequences, and even fewer have well-defined or curated annotations. For transcriptome studies focusing on organis...

Bibliographic Details
Main Authors: Gan, Ruei-Chi, Chen, Ting-Wen, Wu, Timothy H., Huang, Po-Jung, Lee, Chi-Ching, Yeh, Yuan-Ming, Chiu, Cheng-Hsun, Huang, Hsien-Da, Tang, Petrus
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5260104/
https://www.ncbi.nlm.nih.gov/pubmed/28155708
http://dx.doi.org/10.1186/s12859-016-1366-1
author Gan, Ruei-Chi
Chen, Ting-Wen
Wu, Timothy H.
Huang, Po-Jung
Lee, Chi-Ching
Yeh, Yuan-Ming
Chiu, Cheng-Hsun
Huang, Hsien-Da
Tang, Petrus
collection PubMed
description BACKGROUND: Next-generation sequencing promises the de novo genomic and transcriptomic analysis of samples of interest. However, only a few organisms have reference genomic sequences, and even fewer have well-defined or curated annotations. For transcriptome studies focusing on organisms lacking proper reference genomes, the common strategy is de novo assembly followed by functional annotation. However, the analysis becomes even more complicated when multiple transcriptomes are compared. RESULTS: Here, we propose a new analysis strategy and quantification methods for expression levels that not only generate a virtual reference from sequencing data but also enable comparisons between transcriptomes. First, all reads from the transcriptome datasets are pooled together for de novo assembly. The assembled contigs are searched against the NCBI NR database to find potential homologous sequences. Based on the search results, a set of virtual transcripts is generated and serves as a reference transcriptome. By using the same reference, normalized quantification values including RC (read counts), eRPKM (estimated RPKM) and eTPM (estimated TPM) can be obtained that are comparable across transcriptome datasets. To demonstrate the feasibility of our strategy, we implemented it in the web service PARRoT. PARRoT stands for Pipeline for Analyzing RNA Reads of Transcriptomes. It analyzes gene expression profiles for two transcriptome sequencing datasets. To clarify the biological meaning of the comparison among transcriptomes, PARRoT further links these virtual transcripts to their potential functions by showing best hits in SwissProt and the NR database and by assigning GO terms. Our demo datasets showed that PARRoT can analyze two paired-end transcriptomic datasets of approximately 100 million reads within just three hours.
CONCLUSIONS: In this study, we proposed and implemented a strategy to analyze transcriptomes from non-reference organisms, which offers the opportunity to quantify and compare transcriptome profiles through a homolog-based virtual transcriptome reference. By using the homolog-based reference, our strategy effectively avoids the problems that may arise from inconsistencies among transcriptomes. This strategy will shed light on the field of comparative genomics for non-model organisms. We have implemented PARRoT as a web service which is freely available at http://parrot.cgu.edu.tw.
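The abstract above names RC, eRPKM and eTPM but does not define the estimated variants. As a rough illustration only, the sketch below applies the standard RPKM and TPM formulas to read counts mapped to a shared set of virtual transcripts. The function names, transcript IDs, lengths and counts are hypothetical and are not taken from PARRoT.

```python
# Sketch: standard RPKM and TPM computed from read counts (RC) against one
# shared reference. This is NOT PARRoT's eRPKM/eTPM definition, which the
# abstract does not give; all names and numbers below are made up.

def rpkm(counts, lengths):
    """Reads per kilobase of transcript per million mapped reads."""
    total = sum(counts.values())
    if total == 0:
        return {t: 0.0 for t in counts}
    return {t: c * 1e9 / (total * lengths[t]) for t, c in counts.items()}


def tpm(counts, lengths):
    """Transcripts per million: length-normalize first, then scale to 1e6."""
    rates = {t: c / lengths[t] for t, c in counts.items()}
    denom = sum(rates.values())
    if denom == 0:
        return {t: 0.0 for t in counts}
    return {t: r * 1e6 / denom for t, r in rates.items()}


# Hypothetical virtual transcripts shared by both samples (lengths in bases).
lengths = {"vt1": 1000, "vt2": 2000}
sample_a = {"vt1": 100, "vt2": 300}
sample_b = {"vt1": 50, "vt2": 50}

for name, counts in (("A", sample_a), ("B", sample_b)):
    print(name, rpkm(counts, lengths), tpm(counts, lengths))
```

Because both samples are quantified against the same transcript IDs and lengths, the resulting values can be compared transcript by transcript, which is the point the abstract makes about using a common reference.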
format Online
Article
Text
id pubmed-5260104
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5260104 2017-01-26 PARRoT- a homology-based strategy to quantify and compare RNA-sequencing from non-model organisms Gan, Ruei-Chi Chen, Ting-Wen Wu, Timothy H. Huang, Po-Jung Lee, Chi-Ching Yeh, Yuan-Ming Chiu, Cheng-Hsun Huang, Hsien-Da Tang, Petrus BMC Bioinformatics Software BioMed Central 2016-12-22 /pmc/articles/PMC5260104/ /pubmed/28155708 http://dx.doi.org/10.1186/s12859-016-1366-1 Text en © The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title PARRoT- a homology-based strategy to quantify and compare RNA-sequencing from non-model organisms
topic Software