More Accurate Transcript Assembly via Parameter Advising

Computational tools used for genomic analyses are becoming more accurate but also increasingly sophisticated and complex. This introduces a new problem in that these pieces of software have a large number of tunable parameters that often have a large influence on the results that are reported. We qu...

Bibliographic Details
Main authors: Deblasio, Dan; Kim, Kwanho; Kingsford, Carl
Format: Online Article Text
Language: English
Published: Mary Ann Liebert, Inc., publishers 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7415876/
https://www.ncbi.nlm.nih.gov/pubmed/32315544
http://dx.doi.org/10.1089/cmb.2019.0286
_version_ 1783569220601118720
author Deblasio, Dan
Kim, Kwanho
Kingsford, Carl
author_facet Deblasio, Dan
Kim, Kwanho
Kingsford, Carl
author_sort Deblasio, Dan
collection PubMed
description Computational tools used for genomic analyses are becoming more accurate but also increasingly sophisticated and complex. This introduces a new problem in that these pieces of software have a large number of tunable parameters that often have a large influence on the results that are reported. We quantify the impact of parameter choice on transcript assembly and take some first steps toward generating a truly automated genomic analysis pipeline by developing a method for automatically choosing input-specific parameter values for reference-based transcript assembly using the Scallop tool. By choosing parameter values for each input, the area under the receiver operator characteristic curve (AUC) when comparing assembled transcripts to a reference transcriptome is increased by an average of 28.9% over using only the default parameter choices on 1595 RNA-Seq samples in the Sequence Read Archive. This approach is general, and when applied to StringTie, it increases the AUC by an average of 13.1% on a set of 65 RNA-Seq experiments from ENCODE. Parameter advisors for both Scallop and StringTie are available on Github.
format Online
Article
Text
id pubmed-7415876
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Mary Ann Liebert, Inc., publishers
record_format MEDLINE/PubMed
spelling pubmed-74158762020-08-10 More Accurate Transcript Assembly via Parameter Advising Deblasio, Dan Kim, Kwanho Kingsford, Carl J Comput Biol ICML 2019 Conference Papers Computational tools used for genomic analyses are becoming more accurate but also increasingly sophisticated and complex. This introduces a new problem in that these pieces of software have a large number of tunable parameters that often have a large influence on the results that are reported. We quantify the impact of parameter choice on transcript assembly and take some first steps toward generating a truly automated genomic analysis pipeline by developing a method for automatically choosing input-specific parameter values for reference-based transcript assembly using the Scallop tool. By choosing parameter values for each input, the area under the receiver operator characteristic curve (AUC) when comparing assembled transcripts to a reference transcriptome is increased by an average of 28.9% over using only the default parameter choices on 1595 RNA-Seq samples in the Sequence Read Archive. This approach is general, and when applied to StringTie, it increases the AUC by an average of 13.1% on a set of 65 RNA-Seq experiments from ENCODE. Parameter advisors for both Scallop and StringTie are available on Github. Mary Ann Liebert, Inc., publishers 2020-08-01 2020-08-04 /pmc/articles/PMC7415876/ /pubmed/32315544 http://dx.doi.org/10.1089/cmb.2019.0286 Text en © Dan DeBlasio, et al., 2020. Published by Mary Ann Liebert, Inc. This Open Access article is distributed under the terms of the Creative Commons License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.
spellingShingle ICML 2019 Conference Papers
Deblasio, Dan
Kim, Kwanho
Kingsford, Carl
More Accurate Transcript Assembly via Parameter Advising
title More Accurate Transcript Assembly via Parameter Advising
title_full More Accurate Transcript Assembly via Parameter Advising
title_fullStr More Accurate Transcript Assembly via Parameter Advising
title_full_unstemmed More Accurate Transcript Assembly via Parameter Advising
title_short More Accurate Transcript Assembly via Parameter Advising
title_sort more accurate transcript assembly via parameter advising
topic ICML 2019 Conference Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7415876/
https://www.ncbi.nlm.nih.gov/pubmed/32315544
http://dx.doi.org/10.1089/cmb.2019.0286
work_keys_str_mv AT deblasiodan moreaccuratetranscriptassemblyviaparameteradvising
AT kimkwanho moreaccuratetranscriptassemblyviaparameteradvising
AT kingsfordcarl moreaccuratetranscriptassemblyviaparameteradvising