Long-read transcriptome data for improved gene prediction in Lentinula edodes

Lentinula edodes is one of the most popular edible mushrooms in the world and contains useful medicinal components such as lentinan. The whole-genome sequence of L. edodes has been determined with the objective of discovering candidate genes associated with agronomic traits, but experimental verific...

Bibliographic Details
Main Authors: Park, Sin-Gi, Yoo, Seung il, Ryu, Dong Sung, Lee, Hyunsung, Ahn, Yong Ju, Ryu, Hojin, Ko, Junsu, Hong, Chang Pyo
Format: Online Article Text
Language: English
Published: Elsevier 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5961913/
https://www.ncbi.nlm.nih.gov/pubmed/29845094
http://dx.doi.org/10.1016/j.dib.2017.09.052
_version_ 1783324808251965440
author Park, Sin-Gi
Yoo, Seung il
Ryu, Dong Sung
Lee, Hyunsung
Ahn, Yong Ju
Ryu, Hojin
Ko, Junsu
Hong, Chang Pyo
author_facet Park, Sin-Gi
Yoo, Seung il
Ryu, Dong Sung
Lee, Hyunsung
Ahn, Yong Ju
Ryu, Hojin
Ko, Junsu
Hong, Chang Pyo
author_sort Park, Sin-Gi
collection PubMed
description Lentinula edodes is one of the most popular edible mushrooms in the world and contains useful medicinal components such as lentinan. The whole-genome sequence of L. edodes has been determined with the objective of discovering candidate genes associated with agronomic traits, but experimental verification of gene models with correction of gene prediction errors is lacking. To improve the accuracy of gene prediction, we produced 12.6 Gb of long-read transcriptome data of variable lengths using PacBio single-molecule real-time (SMRT) sequencing and generated 36,946 transcript clusters with an average length of 2.2 kb. Evidence-driven gene prediction on the basis of long- and short-read RNA sequencing data was performed; a total of 16,610 protein-coding genes were predicted with error correction. Of the predicted genes, 42.2% were verified to be covered by full-length transcript clusters. The raw reads have been deposited in the NCBI SRA database under accession number PRJNA396788.
format Online
Article
Text
id pubmed-5961913
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-59619132018-05-29 Long-read transcriptome data for improved gene prediction in Lentinula edodes Park, Sin-Gi Yoo, Seung il Ryu, Dong Sung Lee, Hyunsung Ahn, Yong Ju Ryu, Hojin Ko, Junsu Hong, Chang Pyo Data Brief Genetics, Genomics and Molecular Biology Lentinula edodes is one of the most popular edible mushrooms in the world and contains useful medicinal components such as lentinan. The whole-genome sequence of L. edodes has been determined with the objective of discovering candidate genes associated with agronomic traits, but experimental verification of gene models with correction of gene prediction errors is lacking. To improve the accuracy of gene prediction, we produced 12.6 Gb of long-read transcriptome data of variable lengths using PacBio single-molecule real-time (SMRT) sequencing and generated 36,946 transcript clusters with an average length of 2.2 kb. Evidence-driven gene prediction on the basis of long- and short-read RNA sequencing data was performed; a total of 16,610 protein-coding genes were predicted with error correction. Of the predicted genes, 42.2% were verified to be covered by full-length transcript clusters. The raw reads have been deposited in the NCBI SRA database under accession number PRJNA396788. Elsevier 2017-09-27 /pmc/articles/PMC5961913/ /pubmed/29845094 http://dx.doi.org/10.1016/j.dib.2017.09.052 Text en © 2017 The Authors http://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Genetics, Genomics and Molecular Biology
Park, Sin-Gi
Yoo, Seung il
Ryu, Dong Sung
Lee, Hyunsung
Ahn, Yong Ju
Ryu, Hojin
Ko, Junsu
Hong, Chang Pyo
Long-read transcriptome data for improved gene prediction in Lentinula edodes
title Long-read transcriptome data for improved gene prediction in Lentinula edodes
topic Genetics, Genomics and Molecular Biology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5961913/
https://www.ncbi.nlm.nih.gov/pubmed/29845094
http://dx.doi.org/10.1016/j.dib.2017.09.052