isONform: reference-free transcriptome reconstruction from Oxford Nanopore data
MOTIVATION: With advances in long-read transcriptome sequencing, we can now fully sequence transcripts, which greatly improves our ability to study transcription processes. A popular long-read transcriptome sequencing technique is Oxford Nanopore Technologies (ONT), which through its cost-effective...
Main Authors: | Petri, Alexander J; Sahlin, Kristoffer |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2023 |
Subjects: | Genome Sequence Analysis |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10311309/ https://www.ncbi.nlm.nih.gov/pubmed/37387174 http://dx.doi.org/10.1093/bioinformatics/btad264 |
_version_ | 1785066715831336960 |
---|---|
author | Petri, Alexander J Sahlin, Kristoffer |
author_facet | Petri, Alexander J Sahlin, Kristoffer |
author_sort | Petri, Alexander J |
collection | PubMed |
description | MOTIVATION: With advances in long-read transcriptome sequencing, we can now fully sequence transcripts, which greatly improves our ability to study transcription processes. A popular long-read transcriptome sequencing technique is Oxford Nanopore Technologies (ONT), which, through its cost-effective sequencing and high throughput, has the potential to characterize the transcriptome in a cell. However, due to transcript variability and sequencing errors, long cDNA reads need substantial bioinformatic processing to produce a set of isoform predictions from the reads. Several genome- and annotation-based methods exist to produce transcript predictions. However, such methods require high-quality genomes and annotations and are limited by the accuracy of long-read splice aligners. In addition, gene families with high heterogeneity may not be well represented by a reference genome and would benefit from reference-free analysis. Reference-free methods to predict transcripts from ONT, such as RATTLE, exist, but their sensitivity is not comparable to reference-based approaches. RESULTS: We present isONform, a high-sensitivity algorithm to construct isoforms from ONT cDNA sequencing data. The algorithm is based on iterative bubble popping on gene graphs built from fuzzy seeds from the reads. Using simulated, synthetic, and biological ONT cDNA data, we show that isONform has substantially higher sensitivity than RATTLE, albeit with some loss in precision. On biological data, we show that isONform’s predictions have substantially higher consistency with the annotation-based method StringTie2 compared with RATTLE. We believe isONform can be used both for isoform construction for organisms without well-annotated genomes and as an orthogonal method to verify predictions of reference-based methods. AVAILABILITY AND IMPLEMENTATION: https://github.com/aljpetri/isONform |
format | Online Article Text |
id | pubmed-10311309 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-10311309 2023-07-01 isONform: reference-free transcriptome reconstruction from Oxford Nanopore data Petri, Alexander J Sahlin, Kristoffer Bioinformatics Genome Sequence Analysis MOTIVATION: With advances in long-read transcriptome sequencing, we can now fully sequence transcripts, which greatly improves our ability to study transcription processes. A popular long-read transcriptome sequencing technique is Oxford Nanopore Technologies (ONT), which, through its cost-effective sequencing and high throughput, has the potential to characterize the transcriptome in a cell. However, due to transcript variability and sequencing errors, long cDNA reads need substantial bioinformatic processing to produce a set of isoform predictions from the reads. Several genome- and annotation-based methods exist to produce transcript predictions. However, such methods require high-quality genomes and annotations and are limited by the accuracy of long-read splice aligners. In addition, gene families with high heterogeneity may not be well represented by a reference genome and would benefit from reference-free analysis. Reference-free methods to predict transcripts from ONT, such as RATTLE, exist, but their sensitivity is not comparable to reference-based approaches. RESULTS: We present isONform, a high-sensitivity algorithm to construct isoforms from ONT cDNA sequencing data. The algorithm is based on iterative bubble popping on gene graphs built from fuzzy seeds from the reads. Using simulated, synthetic, and biological ONT cDNA data, we show that isONform has substantially higher sensitivity than RATTLE, albeit with some loss in precision. On biological data, we show that isONform’s predictions have substantially higher consistency with the annotation-based method StringTie2 compared with RATTLE. We believe isONform can be used both for isoform construction for organisms without well-annotated genomes and as an orthogonal method to verify predictions of reference-based methods. AVAILABILITY AND IMPLEMENTATION: https://github.com/aljpetri/isONform Oxford University Press 2023-06-30 /pmc/articles/PMC10311309/ /pubmed/37387174 http://dx.doi.org/10.1093/bioinformatics/btad264 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Genome Sequence Analysis Petri, Alexander J Sahlin, Kristoffer isONform: reference-free transcriptome reconstruction from Oxford Nanopore data |
title | isONform: reference-free transcriptome reconstruction from Oxford Nanopore data |
title_full | isONform: reference-free transcriptome reconstruction from Oxford Nanopore data |
title_fullStr | isONform: reference-free transcriptome reconstruction from Oxford Nanopore data |
title_full_unstemmed | isONform: reference-free transcriptome reconstruction from Oxford Nanopore data |
title_short | isONform: reference-free transcriptome reconstruction from Oxford Nanopore data |
title_sort | isonform: reference-free transcriptome reconstruction from oxford nanopore data |
topic | Genome Sequence Analysis |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10311309/ https://www.ncbi.nlm.nih.gov/pubmed/37387174 http://dx.doi.org/10.1093/bioinformatics/btad264 |
work_keys_str_mv | AT petrialexanderj isonformreferencefreetranscriptomereconstructionfromoxfordnanoporedata AT sahlinkristoffer isonformreferencefreetranscriptomereconstructionfromoxfordnanoporedata |