
Exploiting single-molecule transcript sequencing for eukaryotic gene prediction

We develop a method to predict and validate gene models using PacBio single-molecule, real-time (SMRT) cDNA reads. Ninety-eight percent of full-insert SMRT reads span complete open reading frames. Gene model validation using SMRT reads is developed as an automated process. Optimized training and predic...


Bibliographic Details
Main authors: Minoche, André E., Dohm, Juliane C., Schneider, Jessica, Holtgräwe, Daniela, Viehöver, Prisca, Montfort, Magda, Rosleff Sörensen, Thomas, Weisshaar, Bernd, Himmelbauer, Heinz
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4556409/
https://www.ncbi.nlm.nih.gov/pubmed/26328666
http://dx.doi.org/10.1186/s13059-015-0729-7
_version_ 1782388346533707776
author Minoche, André E.
Dohm, Juliane C.
Schneider, Jessica
Holtgräwe, Daniela
Viehöver, Prisca
Montfort, Magda
Rosleff Sörensen, Thomas
Weisshaar, Bernd
Himmelbauer, Heinz
author_facet Minoche, André E.
Dohm, Juliane C.
Schneider, Jessica
Holtgräwe, Daniela
Viehöver, Prisca
Montfort, Magda
Rosleff Sörensen, Thomas
Weisshaar, Bernd
Himmelbauer, Heinz
author_sort Minoche, André E.
collection PubMed
description We develop a method to predict and validate gene models using PacBio single-molecule, real-time (SMRT) cDNA reads. Ninety-eight percent of full-insert SMRT reads span complete open reading frames. Gene model validation using SMRT reads is developed as an automated process. Optimized training and prediction settings and mRNA-seq noise reduction of assisting Illumina reads result in increased gene prediction sensitivity and precision. Additionally, we present an improved gene set for sugar beet (Beta vulgaris) and the first genome-wide gene set for spinach (Spinacia oleracea). The workflow and guidelines are a valuable resource to obtain comprehensive gene sets for newly sequenced genomes of non-model eukaryotes. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13059-015-0729-7) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4556409
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4556409 2015-09-02 Exploiting single-molecule transcript sequencing for eukaryotic gene prediction Minoche, André E. Dohm, Juliane C. Schneider, Jessica Holtgräwe, Daniela Viehöver, Prisca Montfort, Magda Rosleff Sörensen, Thomas Weisshaar, Bernd Himmelbauer, Heinz Genome Biol Method We develop a method to predict and validate gene models using PacBio single-molecule, real-time (SMRT) cDNA reads. Ninety-eight percent of full-insert SMRT reads span complete open reading frames. Gene model validation using SMRT reads is developed as an automated process. Optimized training and prediction settings and mRNA-seq noise reduction of assisting Illumina reads result in increased gene prediction sensitivity and precision. Additionally, we present an improved gene set for sugar beet (Beta vulgaris) and the first genome-wide gene set for spinach (Spinacia oleracea). The workflow and guidelines are a valuable resource to obtain comprehensive gene sets for newly sequenced genomes of non-model eukaryotes. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13059-015-0729-7) contains supplementary material, which is available to authorized users. BioMed Central 2015-09-02 2015 /pmc/articles/PMC4556409/ /pubmed/26328666 http://dx.doi.org/10.1186/s13059-015-0729-7 Text en © Minoche et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Method
Minoche, André E.
Dohm, Juliane C.
Schneider, Jessica
Holtgräwe, Daniela
Viehöver, Prisca
Montfort, Magda
Rosleff Sörensen, Thomas
Weisshaar, Bernd
Himmelbauer, Heinz
Exploiting single-molecule transcript sequencing for eukaryotic gene prediction
title Exploiting single-molecule transcript sequencing for eukaryotic gene prediction
topic Method
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4556409/
https://www.ncbi.nlm.nih.gov/pubmed/26328666
http://dx.doi.org/10.1186/s13059-015-0729-7