Multiple model species selection for transcriptomics analysis of non-model organisms

BACKGROUND: Transcriptomic sequencing (RNA-seq) related applications allow for rapid explorations due to their high-throughput and relatively fast experimental capabilities, providing unprecedented progress in gene functional annotation, gene regulation analysis, and environmental factor verificatio...


Bibliographic Details
Main authors:	Pai, Tun-Wen; Li, Kuan-Hung; Yang, Cing-Han; Hu, Chin-Hwa; Lin, Han-Jia; Wang, Wen-Der; Chen, Yet-Ran
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2018
Subjects:	Research
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6101069/ https://www.ncbi.nlm.nih.gov/pubmed/30367568 http://dx.doi.org/10.1186/s12859-018-2278-z

author	Pai, Tun-Wen Li, Kuan-Hung Yang, Cing-Han Hu, Chin-Hwa Lin, Han-Jia Wang, Wen-Der Chen, Yet-Ran
author_facet	Pai, Tun-Wen Li, Kuan-Hung Yang, Cing-Han Hu, Chin-Hwa Lin, Han-Jia Wang, Wen-Der Chen, Yet-Ran
author_sort	Pai, Tun-Wen
collection	PubMed
description	BACKGROUND: Transcriptomic sequencing (RNA-seq) related applications allow for rapid explorations due to their high-throughput and relatively fast experimental capabilities, providing unprecedented progress in gene functional annotation, gene regulation analysis, and environmental factor verification. However, with increasing amounts of sequenced reads and reference model species, the selection of appropriate reference species for gene annotation has become a new challenge. METHODS: We proposed a novel approach for finding the most effective reference model species through taxonomic associations and ultra-conserved orthologous (UCO) gene comparisons among species. An online system for multiple species selection (MSS) for RNA-seq differential expression analysis was developed, and comprehensive genomic annotations from 291 reference model eukaryotic species were retrieved from the RefSeq, KEGG, and UniProt databases. RESULTS: Using the proposed MSS pipeline, gene ontology and biological pathway enrichment analysis can be efficiently achieved, especially in the case of transcriptomic analysis of non-model organisms. The results showed that the proposed method solved problems related to limitations in annotation information and provided a roughly twenty-fold reduction in computational time, resulting in more accurate results than those of traditional approaches of using a single model reference species or the large non-redundant reference database. CONCLUSIONS: Selection of appropriate reference model species helps to reduce missing annotation information, allowing for more comprehensive results than those obtained with a single model reference species. In addition, adequate model species selection reduces the computational time significantly while retaining the same order of accuracy. The proposed system indeed provides superior performance by selecting appropriate multiple species for transcriptomic analysis compared to traditional approaches. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2278-z) contains supplementary material, which is available to authorized users.
format	Online Article Text
id	pubmed-6101069
institution	National Center for Biotechnology Information
language	English
publishDate	2018
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-6101069 2018-08-27 Multiple model species selection for transcriptomics analysis of non-model organisms Pai, Tun-Wen Li, Kuan-Hung Yang, Cing-Han Hu, Chin-Hwa Lin, Han-Jia Wang, Wen-Der Chen, Yet-Ran BMC Bioinformatics Research BACKGROUND: Transcriptomic sequencing (RNA-seq) related applications allow for rapid explorations due to their high-throughput and relatively fast experimental capabilities, providing unprecedented progress in gene functional annotation, gene regulation analysis, and environmental factor verification. However, with increasing amounts of sequenced reads and reference model species, the selection of appropriate reference species for gene annotation has become a new challenge. METHODS: We proposed a novel approach for finding the most effective reference model species through taxonomic associations and ultra-conserved orthologous (UCO) gene comparisons among species. An online system for multiple species selection (MSS) for RNA-seq differential expression analysis was developed, and comprehensive genomic annotations from 291 reference model eukaryotic species were retrieved from the RefSeq, KEGG, and UniProt databases. RESULTS: Using the proposed MSS pipeline, gene ontology and biological pathway enrichment analysis can be efficiently achieved, especially in the case of transcriptomic analysis of non-model organisms. The results showed that the proposed method solved problems related to limitations in annotation information and provided a roughly twenty-fold reduction in computational time, resulting in more accurate results than those of traditional approaches of using a single model reference species or the large non-redundant reference database. CONCLUSIONS: Selection of appropriate reference model species helps to reduce missing annotation information, allowing for more comprehensive results than those obtained with a single model reference species. 
In addition, adequate model species selection reduces the computational time significantly while retaining the same order of accuracy. The proposed system indeed provides superior performance by selecting appropriate multiple species for transcriptomic analysis compared to traditional approaches. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2278-z) contains supplementary material, which is available to authorized users. BioMed Central 2018-08-13 /pmc/articles/PMC6101069/ /pubmed/30367568 http://dx.doi.org/10.1186/s12859-018-2278-z Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle	Research Pai, Tun-Wen Li, Kuan-Hung Yang, Cing-Han Hu, Chin-Hwa Lin, Han-Jia Wang, Wen-Der Chen, Yet-Ran Multiple model species selection for transcriptomics analysis of non-model organisms
title	Multiple model species selection for transcriptomics analysis of non-model organisms
title_full	Multiple model species selection for transcriptomics analysis of non-model organisms
title_fullStr	Multiple model species selection for transcriptomics analysis of non-model organisms
title_full_unstemmed	Multiple model species selection for transcriptomics analysis of non-model organisms
title_short	Multiple model species selection for transcriptomics analysis of non-model organisms
title_sort	multiple model species selection for transcriptomics analysis of non-model organisms
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6101069/ https://www.ncbi.nlm.nih.gov/pubmed/30367568 http://dx.doi.org/10.1186/s12859-018-2278-z
