The State of Software for Evolutionary Biology


Bibliographic Details
Main authors: Darriba, Diego, Flouri, Tomáš, Stamatakis, Alexandros
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5913673/
https://www.ncbi.nlm.nih.gov/pubmed/29385525
http://dx.doi.org/10.1093/molbev/msy014
author Darriba, Diego
Flouri, Tomáš
Stamatakis, Alexandros
collection PubMed
description With Next Generation Sequencing data being routinely used, evolutionary biology is transforming into a computational science. Thus, researchers have to rely on a growing number of increasingly complex software tools. All widely used core tools in the field have grown considerably in terms of the number of features as well as lines of code and, consequently, also with respect to software complexity. A topic that has received little attention is the software engineering quality of widely used core analysis tools. Software developers appear to rarely assess the quality of their code, and this can have negative consequences for end-users. To this end, we assessed the code quality of 16 highly cited and compute-intensive tools mainly written in C/C++ (e.g., MrBayes, MAFFT, SweepFinder) and Java (BEAST) from the broader area of evolutionary biology that are being routinely used in current data analysis pipelines. Because the software engineering quality of the tools we analyzed is rather unsatisfactory, we provide a list of best practices for improving the quality of existing tools and list techniques that can be deployed for developing reliable, high-quality scientific software from scratch. Finally, we also discuss journal as well as science policy and, more importantly, funding issues that need to be addressed for improving software engineering quality as well as ensuring support for developing new and maintaining existing software. Our intention is to raise the awareness of the community regarding software engineering quality issues and to emphasize the substantial lack of funding for scientific software development.
format Online
Article
Text
id pubmed-5913673
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-5913673 2018-04-30 The State of Software for Evolutionary Biology Darriba, Diego Flouri, Tomáš Stamatakis, Alexandros Mol Biol Evol Review Oxford University Press 2018-05 2018-01-29 /pmc/articles/PMC5913673/ /pubmed/29385525 http://dx.doi.org/10.1093/molbev/msy014 Text en © The Author(s) 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title The State of Software for Evolutionary Biology
topic Review
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5913673/
https://www.ncbi.nlm.nih.gov/pubmed/29385525
http://dx.doi.org/10.1093/molbev/msy014