A high-resolution single-molecule sequencing-based Arabidopsis transcriptome using novel methods of Iso-seq analysis
BACKGROUND: Accurate and comprehensive annotation of transcript sequences is essential for transcript quantification and differential gene and transcript expression analysis. Single-molecule long-read sequencing technologies provide improved integrity of transcript structures including alternative s...
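Main authors: | Zhang, Runxuan; Kuo, Richard; Coulter, Max; Calixto, Cristiane P. G.; Entizne, Juan Carlos; Guo, Wenbin; Marquez, Yamile; Milne, Linda; Riegler, Stefan; Matsui, Akihiro; Tanaka, Maho; Harvey, Sarah; Gao, Yubang; Wießner-Kroh, Theresa; Paniagua, Alejandro; Crespi, Martin; Denby, Katherine; Hur, Asa ben; Huq, Enamul; Jantsch, Michael; Jarmolowski, Artur; Koester, Tino; Laubinger, Sascha; Li, Qingshun Quinn; Gu, Lianfeng; Seki, Motoaki; Staiger, Dorothee; Sunkar, Ramanjulu; Szweykowska-Kulinska, Zofia; Tu, Shih-Long; Wachter, Andreas; Waugh, Robbie; Xiong, Liming; Zhang, Xiao-Ning; Conesa, Ana; Reddy, Anireddy S. N.; Barta, Andrea; Kalyna, Maria; Brown, John W. S. |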
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9264592/ https://www.ncbi.nlm.nih.gov/pubmed/35799267 http://dx.doi.org/10.1186/s13059-022-02711-0 |
_version_ | 1784742996759019520 |
---|---|
author | Zhang, Runxuan Kuo, Richard Coulter, Max Calixto, Cristiane P. G. Entizne, Juan Carlos Guo, Wenbin Marquez, Yamile Milne, Linda Riegler, Stefan Matsui, Akihiro Tanaka, Maho Harvey, Sarah Gao, Yubang Wießner-Kroh, Theresa Paniagua, Alejandro Crespi, Martin Denby, Katherine Hur, Asa ben Huq, Enamul Jantsch, Michael Jarmolowski, Artur Koester, Tino Laubinger, Sascha Li, Qingshun Quinn Gu, Lianfeng Seki, Motoaki Staiger, Dorothee Sunkar, Ramanjulu Szweykowska-Kulinska, Zofia Tu, Shih-Long Wachter, Andreas Waugh, Robbie Xiong, Liming Zhang, Xiao-Ning Conesa, Ana Reddy, Anireddy S. N. Barta, Andrea Kalyna, Maria Brown, John W. S. |
author_sort | Zhang, Runxuan |
collection | PubMed |
description | BACKGROUND: Accurate and comprehensive annotation of transcript sequences is essential for transcript quantification and differential gene and transcript expression analysis. Single-molecule long-read sequencing technologies provide improved integrity of transcript structures including alternative splicing, and transcription start and polyadenylation sites. However, accuracy is significantly affected by sequencing errors, mRNA degradation, or incomplete cDNA synthesis. RESULTS: We present a new and comprehensive Arabidopsis thaliana Reference Transcript Dataset 3 (AtRTD3). AtRTD3 contains over 169,000 transcripts—twice that of the best current Arabidopsis transcriptome and including over 1500 novel genes. Seventy-eight percent of transcripts are from Iso-seq with accurately defined splice junctions and transcription start and end sites. We develop novel methods to determine splice junctions and transcription start and end sites accurately. Mismatch profiles around splice junctions provide a powerful feature to distinguish correct splice junctions and remove false splice junctions. Stratified approaches identify high-confidence transcription start and end sites and remove fragmentary transcripts due to degradation. AtRTD3 is a major improvement over existing transcriptomes as demonstrated by analysis of an Arabidopsis cold response RNA-seq time-series. AtRTD3 provides higher resolution of transcript expression profiling and identifies cold-induced differential transcription start and polyadenylation site usage. CONCLUSIONS: AtRTD3 is the most comprehensive Arabidopsis transcriptome currently. It improves the precision of differential gene and transcript expression, differential alternative splicing, and transcription start/end site usage analysis from RNA-seq data. The novel methods for identifying accurate splice junctions and transcription start/end sites are widely applicable and will improve single-molecule sequencing analysis from any species. 
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-022-02711-0. |
format | Online Article Text |
id | pubmed-9264592 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-9264592 2022-07-09 A high-resolution single-molecule sequencing-based Arabidopsis transcriptome using novel methods of Iso-seq analysis Genome Biol Research BioMed Central 2022-07-07 /pmc/articles/PMC9264592/ /pubmed/35799267 http://dx.doi.org/10.1186/s13059-022-02711-0 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | A high-resolution single-molecule sequencing-based Arabidopsis transcriptome using novel methods of Iso-seq analysis |
title_sort | high-resolution single-molecule sequencing-based arabidopsis transcriptome using novel methods of iso-seq analysis |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9264592/ https://www.ncbi.nlm.nih.gov/pubmed/35799267 http://dx.doi.org/10.1186/s13059-022-02711-0 |