P_RNA_scaffolder: a fast and accurate genome scaffolder using paired-end RNA-sequencing reads

BACKGROUND: Obtaining complete gene structures is a major goal of genome assembly. Some gene regions are fragmented in both low-quality and high-quality assemblies. Therefore, new approaches are needed to recover gene regions. Genomes are widely transcribed, generating messenger and non-coding RNAs. Th...

Bibliographic Details
Main authors: Zhu, Bai-Han, Xiao, Jun, Xue, Wei, Xu, Gui-Cai, Sun, Ming-Yuan, Li, Jiong-Tang
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5834899/
https://www.ncbi.nlm.nih.gov/pubmed/29499650
http://dx.doi.org/10.1186/s12864-018-4567-3
author Zhu, Bai-Han
Xiao, Jun
Xue, Wei
Xu, Gui-Cai
Sun, Ming-Yuan
Li, Jiong-Tang
collection PubMed
description BACKGROUND: Obtaining complete gene structures is a major goal of genome assembly. Some gene regions are fragmented in both low-quality and high-quality assemblies. Therefore, new approaches are needed to recover gene regions. Genomes are widely transcribed, generating messenger and non-coding RNAs. These widespread transcripts can be used to scaffold genomes and complete transcribed regions. RESULTS: We present P_RNA_scaffolder, a fast and accurate tool that uses paired-end RNA-sequencing reads to scaffold genomes. This tool aims to improve the completeness of both protein-coding and non-coding genes. After this tool was applied to scaffolding human contigs, the structures of both protein-coding genes and circular RNAs were almost completely recovered and equivalent to those in a complete genome, especially for long proteins and long circular RNAs. Tested in various species, P_RNA_scaffolder exhibited higher speed and efficiency than existing state-of-the-art scaffolders. This tool also improved the contiguity of genome assemblies generated by current mate-pair scaffolding and third-generation single-molecule sequencing assembly. CONCLUSIONS: P_RNA_scaffolder can improve the contiguity of genome assemblies and benefit gene prediction. This tool is available at http://www.fishbrowser.org/software/P_RNA_scaffolder. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12864-018-4567-3) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5834899
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5834899 2018-03-05 BMC Genomics Methodology Article BioMed Central 2018-03-02 /pmc/articles/PMC5834899/ /pubmed/29499650 http://dx.doi.org/10.1186/s12864-018-4567-3 Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title P_RNA_scaffolder: a fast and accurate genome scaffolder using paired-end RNA-sequencing reads
topic Methodology Article