Exploiting single-molecule transcript sequencing for eukaryotic gene prediction
We develop a method to predict and validate gene models using PacBio single-molecule, real-time (SMRT) cDNA reads. Ninety-eight percent of full-insert SMRT reads span complete open reading frames. Gene model validation using SMRT reads is developed as an automated process. Optimized training and predic...
Main Authors: | Minoche, André E.; Dohm, Juliane C.; Schneider, Jessica; Holtgräwe, Daniela; Viehöver, Prisca; Montfort, Magda; Rosleff Sörensen, Thomas; Weisshaar, Bernd; Himmelbauer, Heinz |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2015 |
Subjects: | Method |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4556409/ https://www.ncbi.nlm.nih.gov/pubmed/26328666 http://dx.doi.org/10.1186/s13059-015-0729-7 |
_version_ | 1782388346533707776 |
---|---|
author | Minoche, André E.; Dohm, Juliane C.; Schneider, Jessica; Holtgräwe, Daniela; Viehöver, Prisca; Montfort, Magda; Rosleff Sörensen, Thomas; Weisshaar, Bernd; Himmelbauer, Heinz |
author_facet | Minoche, André E.; Dohm, Juliane C.; Schneider, Jessica; Holtgräwe, Daniela; Viehöver, Prisca; Montfort, Magda; Rosleff Sörensen, Thomas; Weisshaar, Bernd; Himmelbauer, Heinz |
author_sort | Minoche, André E. |
collection | PubMed |
description | We develop a method to predict and validate gene models using PacBio single-molecule, real-time (SMRT) cDNA reads. Ninety-eight percent of full-insert SMRT reads span complete open reading frames. Gene model validation using SMRT reads is developed as an automated process. Optimized training and prediction settings and mRNA-seq noise reduction of assisting Illumina reads result in increased gene prediction sensitivity and precision. Additionally, we present an improved gene set for sugar beet (Beta vulgaris) and the first genome-wide gene set for spinach (Spinacia oleracea). The workflow and guidelines are a valuable resource to obtain comprehensive gene sets for newly sequenced genomes of non-model eukaryotes. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13059-015-0729-7) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4556409 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4556409 2015-09-02 Exploiting single-molecule transcript sequencing for eukaryotic gene prediction Minoche, André E. Dohm, Juliane C. Schneider, Jessica Holtgräwe, Daniela Viehöver, Prisca Montfort, Magda Rosleff Sörensen, Thomas Weisshaar, Bernd Himmelbauer, Heinz Genome Biol Method We develop a method to predict and validate gene models using PacBio single-molecule, real-time (SMRT) cDNA reads. Ninety-eight percent of full-insert SMRT reads span complete open reading frames. Gene model validation using SMRT reads is developed as an automated process. Optimized training and prediction settings and mRNA-seq noise reduction of assisting Illumina reads result in increased gene prediction sensitivity and precision. Additionally, we present an improved gene set for sugar beet (Beta vulgaris) and the first genome-wide gene set for spinach (Spinacia oleracea). The workflow and guidelines are a valuable resource to obtain comprehensive gene sets for newly sequenced genomes of non-model eukaryotes. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13059-015-0729-7) contains supplementary material, which is available to authorized users. BioMed Central 2015-09-02 2015 /pmc/articles/PMC4556409/ /pubmed/26328666 http://dx.doi.org/10.1186/s13059-015-0729-7 Text en © Minoche et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Method Minoche, André E. Dohm, Juliane C. Schneider, Jessica Holtgräwe, Daniela Viehöver, Prisca Montfort, Magda Rosleff Sörensen, Thomas Weisshaar, Bernd Himmelbauer, Heinz Exploiting single-molecule transcript sequencing for eukaryotic gene prediction |
title | Exploiting single-molecule transcript sequencing for eukaryotic gene prediction |
topic | Method |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4556409/ https://www.ncbi.nlm.nih.gov/pubmed/26328666 http://dx.doi.org/10.1186/s13059-015-0729-7 |