More Accurate Transcript Assembly via Parameter Advising
Computational tools used for genomic analyses are becoming more accurate but also increasingly sophisticated and complex. This introduces a new problem in that these pieces of software have a large number of tunable parameters that often have a large influence on the results that are reported. We qu...
Main authors: | Deblasio, Dan; Kim, Kwanho; Kingsford, Carl |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Mary Ann Liebert, Inc., publishers 2020 |
Subjects: | ICML 2019 Conference Papers |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7415876/ https://www.ncbi.nlm.nih.gov/pubmed/32315544 http://dx.doi.org/10.1089/cmb.2019.0286 |
_version_ | 1783569220601118720 |
---|---|
author | Deblasio, Dan Kim, Kwanho Kingsford, Carl |
collection | PubMed |
description | Computational tools used for genomic analyses are becoming more accurate but also increasingly sophisticated and complex. This introduces a new problem: these pieces of software have a large number of tunable parameters that often have a large influence on the results that are reported. We quantify the impact of parameter choice on transcript assembly and take some first steps toward generating a truly automated genomic analysis pipeline by developing a method for automatically choosing input-specific parameter values for reference-based transcript assembly using the Scallop tool. By choosing parameter values for each input, the area under the receiver operating characteristic curve (AUC) when comparing assembled transcripts to a reference transcriptome is increased by an average of 28.9% over using only the default parameter choices on 1595 RNA-Seq samples in the Sequence Read Archive. This approach is general, and when applied to StringTie, it increases the AUC by an average of 13.1% on a set of 65 RNA-Seq experiments from ENCODE. Parameter advisors for both Scallop and StringTie are available on GitHub. |
format | Online Article Text |
id | pubmed-7415876 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Mary Ann Liebert, Inc., publishers |
record_format | MEDLINE/PubMed |
spelling | pubmed-7415876; 2020-08-10; More Accurate Transcript Assembly via Parameter Advising; Deblasio, Dan; Kim, Kwanho; Kingsford, Carl; J Comput Biol; ICML 2019 Conference Papers; Mary Ann Liebert, Inc., publishers; 2020-08-01; 2020-08-04; /pmc/articles/PMC7415876/ /pubmed/32315544 http://dx.doi.org/10.1089/cmb.2019.0286; Text; en; © Dan DeBlasio, et al., 2020. Published by Mary Ann Liebert, Inc. This Open Access article is distributed under the terms of the Creative Commons License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. |
title | More Accurate Transcript Assembly via Parameter Advising |
topic | ICML 2019 Conference Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7415876/ https://www.ncbi.nlm.nih.gov/pubmed/32315544 http://dx.doi.org/10.1089/cmb.2019.0286 |