P_RNA_scaffolder: a fast and accurate genome scaffolder using paired-end RNA-sequencing reads
BACKGROUND: Obtaining complete gene structures is one major goal of genome assembly. Some gene regions are fragmented in both low-quality and high-quality assemblies. Therefore, new approaches are needed to recover gene regions. Genomes are widely transcribed, generating messenger and non-coding RNAs. Th...
Main authors: | Zhu, Bai-Han; Xiao, Jun; Xue, Wei; Xu, Gui-Cai; Sun, Ming-Yuan; Li, Jiong-Tang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central, 2018 |
Subjects: | Methodology Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5834899/ https://www.ncbi.nlm.nih.gov/pubmed/29499650 http://dx.doi.org/10.1186/s12864-018-4567-3 |
_version_ | 1783303732743634944 |
---|---|
author | Zhu, Bai-Han; Xiao, Jun; Xue, Wei; Xu, Gui-Cai; Sun, Ming-Yuan; Li, Jiong-Tang |
author_facet | Zhu, Bai-Han; Xiao, Jun; Xue, Wei; Xu, Gui-Cai; Sun, Ming-Yuan; Li, Jiong-Tang |
author_sort | Zhu, Bai-Han |
collection | PubMed |
description | BACKGROUND: Obtaining complete gene structures is one major goal of genome assembly. Some gene regions are fragmented in both low-quality and high-quality assemblies. Therefore, new approaches are needed to recover gene regions. Genomes are widely transcribed, generating messenger and non-coding RNAs. These widespread transcripts can be used to scaffold genomes and complete transcribed regions. RESULTS: We present P_RNA_scaffolder, a fast and accurate tool using paired-end RNA-sequencing reads to scaffold genomes. This tool aims to improve the completeness of both protein-coding and non-coding genes. After this tool was applied to scaffolding human contigs, the structures of both protein-coding genes and circular RNAs were almost completely recovered and equivalent to those in a complete genome, especially for long proteins and long circular RNAs. Tested in various species, P_RNA_scaffolder exhibited higher speed and efficiency than the existing state-of-the-art scaffolders. This tool also improved the contiguity of genome assemblies generated by current mate-pair scaffolding and third-generation single-molecule sequencing assembly. CONCLUSIONS: P_RNA_scaffolder can improve the contiguity of genome assembly and benefit gene prediction. This tool is available at http://www.fishbrowser.org/software/P_RNA_scaffolder. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12864-018-4567-3) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5834899 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5834899 2018-03-05 P_RNA_scaffolder: a fast and accurate genome scaffolder using paired-end RNA-sequencing reads Zhu, Bai-Han; Xiao, Jun; Xue, Wei; Xu, Gui-Cai; Sun, Ming-Yuan; Li, Jiong-Tang BMC Genomics Methodology Article BioMed Central 2018-03-02 /pmc/articles/PMC5834899/ /pubmed/29499650 http://dx.doi.org/10.1186/s12864-018-4567-3 Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article Zhu, Bai-Han Xiao, Jun Xue, Wei Xu, Gui-Cai Sun, Ming-Yuan Li, Jiong-Tang P_RNA_scaffolder: a fast and accurate genome scaffolder using paired-end RNA-sequencing reads |
title | P_RNA_scaffolder: a fast and accurate genome scaffolder using paired-end RNA-sequencing reads |
title_sort | p_rna_scaffolder: a fast and accurate genome scaffolder using paired-end rna-sequencing reads |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5834899/ https://www.ncbi.nlm.nih.gov/pubmed/29499650 http://dx.doi.org/10.1186/s12864-018-4567-3 |
work_keys_str_mv | AT zhubaihan prnascaffolderafastandaccurategenomescaffolderusingpairedendrnasequencingreads AT xiaojun prnascaffolderafastandaccurategenomescaffolderusingpairedendrnasequencingreads AT xuewei prnascaffolderafastandaccurategenomescaffolderusingpairedendrnasequencingreads AT xuguicai prnascaffolderafastandaccurategenomescaffolderusingpairedendrnasequencingreads AT sunmingyuan prnascaffolderafastandaccurategenomescaffolderusingpairedendrnasequencingreads AT lijiongtang prnascaffolderafastandaccurategenomescaffolderusingpairedendrnasequencingreads |