From Peer-Reviewed to Peer-Reproduced in Scholarly Publishing: The Complementary Roles of Data Models and Workflows in Bioinformatics

Bibliographic Details
Main authors: González-Beltrán, Alejandra, Li, Peter, Zhao, Jun, Avila-Garcia, Maria Susana, Roos, Marco, Thompson, Mark, van der Horst, Eelke, Kaliyaperumal, Rajaram, Luo, Ruibang, Lee, Tin-Lap, Lam, Tak-wah, Edmunds, Scott C., Sansone, Susanna-Assunta, Rocca-Serra, Philippe
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4495984/
https://www.ncbi.nlm.nih.gov/pubmed/26154165
http://dx.doi.org/10.1371/journal.pone.0127612
author González-Beltrán, Alejandra
Li, Peter
Zhao, Jun
Avila-Garcia, Maria Susana
Roos, Marco
Thompson, Mark
van der Horst, Eelke
Kaliyaperumal, Rajaram
Luo, Ruibang
Lee, Tin-Lap
Lam, Tak-wah
Edmunds, Scott C.
Sansone, Susanna-Assunta
Rocca-Serra, Philippe
collection PubMed
description MOTIVATION: Reproducing the results from a scientific paper can be challenging due to the absence of data and the computational tools required for their analysis. In addition, details relating to the procedures used to obtain the published results can be difficult to discern due to the use of natural language when reporting how experiments have been performed. The Investigation/Study/Assay (ISA), Nanopublications (NP), and Research Objects (RO) models are conceptual data modelling frameworks that can structure such information from scientific papers. Computational workflow platforms can also be used to reproduce analyses of data in a principled manner. We assessed the extent by which ISA, NP, and RO models, together with the Galaxy workflow system, can capture the experimental processes and reproduce the findings of a previously published paper reporting on the development of SOAPdenovo2, a de novo genome assembler.
RESULTS: Executable workflows were developed using Galaxy, which reproduced results that were consistent with the published findings. A structured representation of the information in the SOAPdenovo2 paper was produced by combining the use of ISA, NP, and RO models. By structuring the information in the published paper using these data and scientific workflow modelling frameworks, it was possible to explicitly declare elements of experimental design, variables, and findings. The models served as guides in the curation of scientific information and this led to the identification of inconsistencies in the original published paper, thereby allowing its authors to publish corrections in the form of an errata.
AVAILABILITY: SOAPdenovo2 scripts, data, and results are available through the GigaScience Database: http://dx.doi.org/10.5524/100044; the workflows are available from GigaGalaxy: http://galaxy.cbiit.cuhk.edu.hk; and the representations using the ISA, NP, and RO models are available through the SOAPdenovo2 case study website http://isa-tools.github.io/soapdenovo2/. Contact: philippe.rocca-serra@oerc.ox.ac.uk and susanna-assunta.sansone@oerc.ox.ac.uk.
format Online
Article
Text
id pubmed-4495984
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4495984 2015-07-15 PLoS One Research Article Public Library of Science 2015-07-08 /pmc/articles/PMC4495984/ /pubmed/26154165 http://dx.doi.org/10.1371/journal.pone.0127612 Text en © 2015 González-Beltrán et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title From Peer-Reviewed to Peer-Reproduced in Scholarly Publishing: The Complementary Roles of Data Models and Workflows in Bioinformatics
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4495984/
https://www.ncbi.nlm.nih.gov/pubmed/26154165
http://dx.doi.org/10.1371/journal.pone.0127612