
Extensible benchmarking of methods that identify and quantify polyadenylation sites from RNA-seq data

Bibliographic Details
Main Authors: Bryce-Smith, Sam; Burri, Dominik; Gazzara, Matthew R.; Herrmann, Christina J.; Danecka, Weronika; Fitzsimmons, Christina M.; Wan, Yuk Kei; Zhuang, Farica; Fansler, Mervin M.; Fernández, José M.; Ferret, Meritxell; Gonzalez-Uriarte, Asier; Haynes, Samuel; Herdman, Chelsea; Kanitz, Alexander; Katsantoni, Maria; Marini, Federico; McDonnel, Euan; Nicolet, Ben; Poon, Chi-Lam; Rot, Gregor; Schärfen, Leonard; Wu, Pin-Jou; Yoon, Yoseop; Barash, Yoseph; Zavolan, Mihaela
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10653393/
https://www.ncbi.nlm.nih.gov/pubmed/37816550
http://dx.doi.org/10.1261/rna.079849.123
author Bryce-Smith, Sam
Burri, Dominik
Gazzara, Matthew R.
Herrmann, Christina J.
Danecka, Weronika
Fitzsimmons, Christina M.
Wan, Yuk Kei
Zhuang, Farica
Fansler, Mervin M.
Fernández, José M.
Ferret, Meritxell
Gonzalez-Uriarte, Asier
Haynes, Samuel
Herdman, Chelsea
Kanitz, Alexander
Katsantoni, Maria
Marini, Federico
McDonnel, Euan
Nicolet, Ben
Poon, Chi-Lam
Rot, Gregor
Schärfen, Leonard
Wu, Pin-Jou
Yoon, Yoseop
Barash, Yoseph
Zavolan, Mihaela
author_sort Bryce-Smith, Sam
collection PubMed
description The tremendous rate with which data is generated and analysis methods emerge makes it increasingly difficult to keep track of their domain of applicability, assumptions, limitations, and consequently, of the efficacy and precision with which they solve specific tasks. Therefore, there is an increasing need for benchmarks, and for the provision of infrastructure for continuous method evaluation. APAeval is an international community effort, organized by the RNA Society in 2021, to benchmark tools for the identification and quantification of the usage of alternative polyadenylation (APA) sites from short-read, bulk RNA-sequencing (RNA-seq) data. Here, we reviewed 17 tools and benchmarked eight on their ability to perform APA identification and quantification, using a comprehensive set of RNA-seq experiments comprising real, synthetic, and matched 3′-end sequencing data. To support continuous benchmarking, we have incorporated the results into the OpenEBench online platform, which allows for continuous extension of the set of methods, metrics, and challenges. We envisage that our analyses will assist researchers in selecting the appropriate tools for their studies, while the containers and reproducible workflows could easily be deployed and extended to evaluate new methods or data sets.
format Online
Article
Text
id pubmed-10653393
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-10653393 2023-12-01 Extensible benchmarking of methods that identify and quantify polyadenylation sites from RNA-seq data RNA Bioinformatics: Benchmark
Cold Spring Harbor Laboratory Press 2023-12 /pmc/articles/PMC10653393/ /pubmed/37816550 http://dx.doi.org/10.1261/rna.079849.123 Text en © 2023 Bryce-Smith et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society. This article, published in RNA, is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at https://creativecommons.org/licenses/by-nc/4.0/.
title Extensible benchmarking of methods that identify and quantify polyadenylation sites from RNA-seq data
topic Bioinformatics: Benchmark
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10653393/
https://www.ncbi.nlm.nih.gov/pubmed/37816550
http://dx.doi.org/10.1261/rna.079849.123