
TransRate: reference-free quality assessment of de novo transcriptome assemblies

TransRate is a tool for reference-free quality assessment of de novo transcriptome assemblies. Using only the sequenced reads and the assembly as input, we show that multiple common artifacts of de novo transcriptome assembly can be readily detected. These include chimeras, structural errors, incomp...


Bibliographic Details
Main Authors: Smith-Unna, Richard, Boursnell, Chris, Patro, Rob, Hibberd, Julian M., Kelly, Steven
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4971766/
https://www.ncbi.nlm.nih.gov/pubmed/27252236
http://dx.doi.org/10.1101/gr.196469.115
author Smith-Unna, Richard
Boursnell, Chris
Patro, Rob
Hibberd, Julian M.
Kelly, Steven
collection PubMed
description TransRate is a tool for reference-free quality assessment of de novo transcriptome assemblies. Using only the sequenced reads and the assembly as input, we show that multiple common artifacts of de novo transcriptome assembly can be readily detected. These include chimeras, structural errors, incomplete assembly, and base errors. TransRate evaluates these errors to produce a diagnostic quality score for each contig, and these contig scores are integrated to evaluate whole assemblies. Thus, TransRate can be used for de novo assembly filtering and optimization as well as comparison of assemblies generated using different methods from the same input reads. Applying the method to a data set of 155 published de novo transcriptome assemblies, we deconstruct the contribution that assembly method, read length, read quantity, and read quality make to the accuracy of de novo transcriptome assemblies and reveal that variance in the quality of the input data explains 43% of the variance in the quality of published de novo transcriptome assemblies. Because TransRate is reference-free, it is suitable for assessment of assemblies of all types of RNA, including assemblies of long noncoding RNA, rRNA, mRNA, and mixed RNA samples.
format Online
Article
Text
id pubmed-4971766
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-4971766
2016-08-25
TransRate: reference-free quality assessment of de novo transcriptome assemblies
Smith-Unna, Richard
Boursnell, Chris
Patro, Rob
Hibberd, Julian M.
Kelly, Steven
Genome Res
Method
TransRate is a tool for reference-free quality assessment of de novo transcriptome assemblies. Using only the sequenced reads and the assembly as input, we show that multiple common artifacts of de novo transcriptome assembly can be readily detected. These include chimeras, structural errors, incomplete assembly, and base errors. TransRate evaluates these errors to produce a diagnostic quality score for each contig, and these contig scores are integrated to evaluate whole assemblies. Thus, TransRate can be used for de novo assembly filtering and optimization as well as comparison of assemblies generated using different methods from the same input reads. Applying the method to a data set of 155 published de novo transcriptome assemblies, we deconstruct the contribution that assembly method, read length, read quantity, and read quality make to the accuracy of de novo transcriptome assemblies and reveal that variance in the quality of the input data explains 43% of the variance in the quality of published de novo transcriptome assemblies. Because TransRate is reference-free, it is suitable for assessment of assemblies of all types of RNA, including assemblies of long noncoding RNA, rRNA, mRNA, and mixed RNA samples.
Cold Spring Harbor Laboratory Press
2016-08
/pmc/articles/PMC4971766/
/pubmed/27252236
http://dx.doi.org/10.1101/gr.196469.115
Text en
© 2016 Smith-Unna et al.; Published by Cold Spring Harbor Laboratory Press
http://creativecommons.org/licenses/by/4.0/
This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.
title TransRate: reference-free quality assessment of de novo transcriptome assemblies
topic Method
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4971766/
https://www.ncbi.nlm.nih.gov/pubmed/27252236
http://dx.doi.org/10.1101/gr.196469.115