Issues in bioinformatics benchmarking: the case study of multiple sequence alignment
Main authors: | Aniba, Mohamed Radhouene; Poch, Olivier; Thompson, Julie D. |
---|---|
Format: | Text |
Language: | English |
Published: | Oxford University Press, 2010 |
Subjects: | Survey and Summary |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2995051/ https://www.ncbi.nlm.nih.gov/pubmed/20639539 http://dx.doi.org/10.1093/nar/gkq625 |
_version_ | 1782193038837153792 |
---|---|
author | Aniba, Mohamed Radhouene Poch, Olivier Thompson, Julie D. |
author_facet | Aniba, Mohamed Radhouene Poch, Olivier Thompson, Julie D. |
author_sort | Aniba, Mohamed Radhouene |
collection | PubMed |
description | The post-genomic era presents many new challenges for the field of bioinformatics. Novel computational approaches are now being developed to handle the large, complex and noisy datasets produced by high throughput technologies. Objective evaluation of these methods is essential (i) to assure high quality, (ii) to identify strong and weak points of the algorithms, (iii) to measure the improvements introduced by new methods and (iv) to enable non-specialists to choose an appropriate tool. Here, we discuss the development of formal benchmarks, designed to represent the current problems encountered in the bioinformatics field. We consider several criteria for building good benchmarks and the advantages to be gained when they are used intelligently. To illustrate these principles, we present a more detailed discussion of benchmarks for multiple alignments of protein sequences. As in many other domains, significant progress has been achieved in the multiple alignment field and the datasets have become progressively more challenging as the existing algorithms have evolved. Finally, we propose directions for future developments that will ensure that the bioinformatics benchmarks correspond to the challenges posed by the high throughput data. |
format | Text |
id | pubmed-2995051 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-2995051 2010-12-01 Issues in bioinformatics benchmarking: the case study of multiple sequence alignment Aniba, Mohamed Radhouene Poch, Olivier Thompson, Julie D. Nucleic Acids Res Survey and Summary The post-genomic era presents many new challenges for the field of bioinformatics. Novel computational approaches are now being developed to handle the large, complex and noisy datasets produced by high throughput technologies. Objective evaluation of these methods is essential (i) to assure high quality, (ii) to identify strong and weak points of the algorithms, (iii) to measure the improvements introduced by new methods and (iv) to enable non-specialists to choose an appropriate tool. Here, we discuss the development of formal benchmarks, designed to represent the current problems encountered in the bioinformatics field. We consider several criteria for building good benchmarks and the advantages to be gained when they are used intelligently. To illustrate these principles, we present a more detailed discussion of benchmarks for multiple alignments of protein sequences. As in many other domains, significant progress has been achieved in the multiple alignment field and the datasets have become progressively more challenging as the existing algorithms have evolved. Finally, we propose directions for future developments that will ensure that the bioinformatics benchmarks correspond to the challenges posed by the high throughput data. Oxford University Press 2010-11 2010-07-17 /pmc/articles/PMC2995051/ /pubmed/20639539 http://dx.doi.org/10.1093/nar/gkq625 Text en © The Author(s) 2010. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/2.5 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Survey and Summary Aniba, Mohamed Radhouene Poch, Olivier Thompson, Julie D. Issues in bioinformatics benchmarking: the case study of multiple sequence alignment |
title | Issues in bioinformatics benchmarking: the case study of multiple sequence alignment |
title_full | Issues in bioinformatics benchmarking: the case study of multiple sequence alignment |
title_fullStr | Issues in bioinformatics benchmarking: the case study of multiple sequence alignment |
title_full_unstemmed | Issues in bioinformatics benchmarking: the case study of multiple sequence alignment |
title_short | Issues in bioinformatics benchmarking: the case study of multiple sequence alignment |
title_sort | issues in bioinformatics benchmarking: the case study of multiple sequence alignment |
topic | Survey and Summary |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2995051/ https://www.ncbi.nlm.nih.gov/pubmed/20639539 http://dx.doi.org/10.1093/nar/gkq625 |
work_keys_str_mv | AT anibamohamedradhouene issuesinbioinformaticsbenchmarkingthecasestudyofmultiplesequencealignment AT pocholivier issuesinbioinformaticsbenchmarkingthecasestudyofmultiplesequencealignment AT thompsonjulied issuesinbioinformaticsbenchmarkingthecasestudyofmultiplesequencealignment |