
Issues in bioinformatics benchmarking: the case study of multiple sequence alignment

The post-genomic era presents many new challenges for the field of bioinformatics. Novel computational approaches are now being developed to handle the large, complex and noisy datasets produced by high throughput technologies. Objective evaluation of these methods is essential (i) to assure high qu...


Bibliographic Details
Main Authors: Aniba, Mohamed Radhouene; Poch, Olivier; Thompson, Julie D.
Format: Text
Language: English
Published: Oxford University Press 2010
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2995051/
https://www.ncbi.nlm.nih.gov/pubmed/20639539
http://dx.doi.org/10.1093/nar/gkq625
author Aniba, Mohamed Radhouene
Poch, Olivier
Thompson, Julie D.
collection PubMed
description The post-genomic era presents many new challenges for the field of bioinformatics. Novel computational approaches are now being developed to handle the large, complex and noisy datasets produced by high throughput technologies. Objective evaluation of these methods is essential (i) to assure high quality, (ii) to identify strong and weak points of the algorithms, (iii) to measure the improvements introduced by new methods and (iv) to enable non-specialists to choose an appropriate tool. Here, we discuss the development of formal benchmarks, designed to represent the current problems encountered in the bioinformatics field. We consider several criteria for building good benchmarks and the advantages to be gained when they are used intelligently. To illustrate these principles, we present a more detailed discussion of benchmarks for multiple alignments of protein sequences. As in many other domains, significant progress has been achieved in the multiple alignment field and the datasets have become progressively more challenging as the existing algorithms have evolved. Finally, we propose directions for future developments that will ensure that the bioinformatics benchmarks correspond to the challenges posed by the high throughput data.
format Text
id pubmed-2995051
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-29950512010-12-01 Issues in bioinformatics benchmarking: the case study of multiple sequence alignment Aniba, Mohamed Radhouene Poch, Olivier Thompson, Julie D. Nucleic Acids Res Survey and Summary The post-genomic era presents many new challenges for the field of bioinformatics. Novel computational approaches are now being developed to handle the large, complex and noisy datasets produced by high throughput technologies. Objective evaluation of these methods is essential (i) to assure high quality, (ii) to identify strong and weak points of the algorithms, (iii) to measure the improvements introduced by new methods and (iv) to enable non-specialists to choose an appropriate tool. Here, we discuss the development of formal benchmarks, designed to represent the current problems encountered in the bioinformatics field. We consider several criteria for building good benchmarks and the advantages to be gained when they are used intelligently. To illustrate these principles, we present a more detailed discussion of benchmarks for multiple alignments of protein sequences. As in many other domains, significant progress has been achieved in the multiple alignment field and the datasets have become progressively more challenging as the existing algorithms have evolved. Finally, we propose directions for future developments that will ensure that the bioinformatics benchmarks correspond to the challenges posed by the high throughput data. Oxford University Press 2010-11 2010-07-17 /pmc/articles/PMC2995051/ /pubmed/20639539 http://dx.doi.org/10.1093/nar/gkq625 Text en © The Author(s) 2010. Published by Oxford University Press. 
http://creativecommons.org/licenses/by-nc/2.5 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Issues in bioinformatics benchmarking: the case study of multiple sequence alignment
topic Survey and Summary