
Benchmark datasets and software for developing and testing methods for large-scale multiple sequence alignment and phylogenetic inference

We have assembled a collection of web pages that contain benchmark datasets and software tools to enable the evaluation of the accuracy and scalability of computational methods for estimating evolutionary relationships. They provide a resource to the scientific community for development of new align...


Bibliographic Details
Main authors: Linder, C. Randal; Suri, Rahul; Liu, Kevin; Warnow, Tandy
Format: Text
Language: English
Published: Public Library of Science 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2989560/
https://www.ncbi.nlm.nih.gov/pubmed/21113335
http://dx.doi.org/10.1371/currents.RRN1195
author Linder, C. Randal
Suri, Rahul
Liu, Kevin
Warnow, Tandy
collection PubMed
description We have assembled a collection of web pages that contain benchmark datasets and software tools to enable the evaluation of the accuracy and scalability of computational methods for estimating evolutionary relationships. They provide a resource to the scientific community for development of new alignment and tree inference methods on very difficult datasets. The datasets are intended to help address three problems: multiple sequence alignment, phylogeny estimation given aligned sequences, and supertree estimation. Datasets from our work include empirical datasets with carefully curated alignments suitable for testing alignment and phylogenetic methods for large-scale systematics studies. Links to other empirical datasets, lacking curated alignments, are also provided. We also include simulated datasets with properties typical of large-scale systematics studies, including high rates of substitutions and indels, and we include the true alignment and tree for each simulated dataset. Finally, we provide links to software tools for generating simulated datasets, and for evaluating the accuracy of alignments and trees estimated on these datasets. We welcome contributions to the benchmark datasets from other researchers.
format Text
id pubmed-2989560
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2989560 2010-11-24 PLoS Curr Tree of Life Public Library of Science 2010-11-18 /pmc/articles/PMC2989560/ /pubmed/21113335 http://dx.doi.org/10.1371/currents.RRN1195 Text en http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Benchmark datasets and software for developing and testing methods for large-scale multiple sequence alignment and phylogenetic inference
topic Tree of Life
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2989560/
https://www.ncbi.nlm.nih.gov/pubmed/21113335
http://dx.doi.org/10.1371/currents.RRN1195