A simple guide to de novo transcriptome assembly and annotation

A transcriptome constructed from short-read RNA sequencing (RNA-seq) is an easily attainable proxy catalog of protein-coding genes when genome assembly is unnecessary, expensive or difficult. In the absence of a sequenced genome to guide the reconstruction process, the transcriptome must be assemble...

Bibliographic Details
Main authors: Raghavan, Venket, Kraft, Louis, Mesny, Fantin, Rigerte, Linda
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8921630/
https://www.ncbi.nlm.nih.gov/pubmed/35076693
http://dx.doi.org/10.1093/bib/bbab563
_version_ 1784669361505566720
author Raghavan, Venket
Kraft, Louis
Mesny, Fantin
Rigerte, Linda
author_facet Raghavan, Venket
Kraft, Louis
Mesny, Fantin
Rigerte, Linda
author_sort Raghavan, Venket
collection PubMed
description A transcriptome constructed from short-read RNA sequencing (RNA-seq) is an easily attainable proxy catalog of protein-coding genes when genome assembly is unnecessary, expensive or difficult. In the absence of a sequenced genome to guide the reconstruction process, the transcriptome must be assembled de novo using only the information available in the RNA-seq reads. Subsequently, the sequences must be annotated in order to identify sequence-intrinsic and evolutionary features in them (for example, protein-coding regions). Although straightforward at first glance, de novo transcriptome assembly and annotation can quickly prove to be challenging undertakings. In addition to familiarizing themselves with the conceptual and technical intricacies of the tasks at hand and the numerous pre- and post-processing steps involved, those interested must also grapple with an overwhelmingly large choice of tools. The lack of standardized workflows, fast pace of development of new tools and techniques and paucity of authoritative literature have served to exacerbate the difficulty of the task even further. Here, we present a comprehensive overview of de novo transcriptome assembly and annotation. We discuss the procedures involved, including pre- and post-processing steps, and present a compendium of corresponding tools.
format Online
Article
Text
id pubmed-8921630
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-89216302022-03-15 A simple guide to de novo transcriptome assembly and annotation Raghavan, Venket Kraft, Louis Mesny, Fantin Rigerte, Linda Brief Bioinform Review A transcriptome constructed from short-read RNA sequencing (RNA-seq) is an easily attainable proxy catalog of protein-coding genes when genome assembly is unnecessary, expensive or difficult. In the absence of a sequenced genome to guide the reconstruction process, the transcriptome must be assembled de novo using only the information available in the RNA-seq reads. Subsequently, the sequences must be annotated in order to identify sequence-intrinsic and evolutionary features in them (for example, protein-coding regions). Although straightforward at first glance, de novo transcriptome assembly and annotation can quickly prove to be challenging undertakings. In addition to familiarizing themselves with the conceptual and technical intricacies of the tasks at hand and the numerous pre- and post-processing steps involved, those interested must also grapple with an overwhelmingly large choice of tools. The lack of standardized workflows, fast pace of development of new tools and techniques and paucity of authoritative literature have served to exacerbate the difficulty of the task even further. Here, we present a comprehensive overview of de novo transcriptome assembly and annotation. We discuss the procedures involved, including pre- and post-processing steps, and present a compendium of corresponding tools. Oxford University Press 2022-01-24 /pmc/articles/PMC8921630/ /pubmed/35076693 http://dx.doi.org/10.1093/bib/bbab563 Text en © The Author(s) 2022. Published by Oxford University Press. 
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Review
Raghavan, Venket
Kraft, Louis
Mesny, Fantin
Rigerte, Linda
A simple guide to de novo transcriptome assembly and annotation
title A simple guide to de novo transcriptome assembly and annotation
title_full A simple guide to de novo transcriptome assembly and annotation
title_fullStr A simple guide to de novo transcriptome assembly and annotation
title_full_unstemmed A simple guide to de novo transcriptome assembly and annotation
title_short A simple guide to de novo transcriptome assembly and annotation
title_sort simple guide to de novo transcriptome assembly and annotation
topic Review
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8921630/
https://www.ncbi.nlm.nih.gov/pubmed/35076693
http://dx.doi.org/10.1093/bib/bbab563
work_keys_str_mv AT raghavanvenket asimpleguidetodenovotranscriptomeassemblyandannotation
AT kraftlouis asimpleguidetodenovotranscriptomeassemblyandannotation
AT mesnyfantin asimpleguidetodenovotranscriptomeassemblyandannotation
AT rigertelinda asimpleguidetodenovotranscriptomeassemblyandannotation
AT raghavanvenket simpleguidetodenovotranscriptomeassemblyandannotation
AT kraftlouis simpleguidetodenovotranscriptomeassemblyandannotation
AT mesnyfantin simpleguidetodenovotranscriptomeassemblyandannotation
AT rigertelinda simpleguidetodenovotranscriptomeassemblyandannotation