Benchmark datasets and software for developing and testing methods for large-scale multiple sequence alignment and phylogenetic inference
We have assembled a collection of web pages that contain benchmark datasets and software tools to enable the evaluation of the accuracy and scalability of computational methods for estimating evolutionary relationships. They provide a resource to the scientific community for development of new align...
Main authors: | Linder, C. Randal; Suri, Rahul; Liu, Kevin; Warnow, Tandy |
---|---|
Format: | Text |
Language: | English |
Published: | Public Library of Science, 2010 |
Subjects: | Tree of Life |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2989560/ https://www.ncbi.nlm.nih.gov/pubmed/21113335 http://dx.doi.org/10.1371/currents.RRN1195 |
_version_ | 1782192365763559424 |
---|---|
author | Linder, C. Randal; Suri, Rahul; Liu, Kevin; Warnow, Tandy |
author_facet | Linder, C. Randal; Suri, Rahul; Liu, Kevin; Warnow, Tandy |
author_sort | Linder, C. Randal |
collection | PubMed |
description | We have assembled a collection of web pages that contain benchmark datasets and software tools to enable the evaluation of the accuracy and scalability of computational methods for estimating evolutionary relationships. They provide a resource to the scientific community for development of new alignment and tree inference methods on very difficult datasets. The datasets are intended to help address three problems: multiple sequence alignment, phylogeny estimation given aligned sequences, and supertree estimation. Datasets from our work include empirical datasets with carefully curated alignments suitable for testing alignment and phylogenetic methods for large-scale systematics studies. Links to other empirical datasets, lacking curated alignments, are also provided. We also include simulated datasets with properties typical of large-scale systematics studies, including high rates of substitutions and indels, and we include the true alignment and tree for each simulated dataset. Finally, we provide links to software tools for generating simulated datasets, and for evaluating the accuracy of alignments and trees estimated on these datasets. We welcome contributions to the benchmark datasets from other researchers. |
format | Text |
id | pubmed-2989560 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-2989560; 2010-11-24; Benchmark datasets and software for developing and testing methods for large-scale multiple sequence alignment and phylogenetic inference; Linder, C. Randal; Suri, Rahul; Liu, Kevin; Warnow, Tandy; PLoS Curr; Tree of Life; We have assembled a collection of web pages that contain benchmark datasets and software tools to enable the evaluation of the accuracy and scalability of computational methods for estimating evolutionary relationships. They provide a resource to the scientific community for development of new alignment and tree inference methods on very difficult datasets. The datasets are intended to help address three problems: multiple sequence alignment, phylogeny estimation given aligned sequences, and supertree estimation. Datasets from our work include empirical datasets with carefully curated alignments suitable for testing alignment and phylogenetic methods for large-scale systematics studies. Links to other empirical datasets, lacking curated alignments, are also provided. We also include simulated datasets with properties typical of large-scale systematics studies, including high rates of substitutions and indels, and we include the true alignment and tree for each simulated dataset. Finally, we provide links to software tools for generating simulated datasets, and for evaluating the accuracy of alignments and trees estimated on these datasets. We welcome contributions to the benchmark datasets from other researchers.; Public Library of Science; 2010-11-18; /pmc/articles/PMC2989560/ /pubmed/21113335 http://dx.doi.org/10.1371/currents.RRN1195; Text; en; http://creativecommons.org/licenses/by/4.0/; This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Tree of Life; Linder, C. Randal; Suri, Rahul; Liu, Kevin; Warnow, Tandy; Benchmark datasets and software for developing and testing methods for large-scale multiple sequence alignment and phylogenetic inference |
title | Benchmark datasets and software for developing and testing methods for large-scale multiple sequence alignment and phylogenetic inference |
title_full | Benchmark datasets and software for developing and testing methods for large-scale multiple sequence alignment and phylogenetic inference |
title_fullStr | Benchmark datasets and software for developing and testing methods for large-scale multiple sequence alignment and phylogenetic inference |
title_full_unstemmed | Benchmark datasets and software for developing and testing methods for large-scale multiple sequence alignment and phylogenetic inference |
title_short | Benchmark datasets and software for developing and testing methods for large-scale multiple sequence alignment and phylogenetic inference |
title_sort | benchmark datasets and software for developing and testing methods for large-scale multiple sequence alignment and phylogenetic inference |
topic | Tree of Life |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2989560/ https://www.ncbi.nlm.nih.gov/pubmed/21113335 http://dx.doi.org/10.1371/currents.RRN1195 |
work_keys_str_mv | AT lindercrandal benchmarkdatasetsandsoftwarefordevelopingandtestingmethodsforlargescalemultiplesequencealignmentandphylogeneticinference AT surirahul benchmarkdatasetsandsoftwarefordevelopingandtestingmethodsforlargescalemultiplesequencealignmentandphylogeneticinference AT liukevin benchmarkdatasetsandsoftwarefordevelopingandtestingmethodsforlargescalemultiplesequencealignmentandphylogeneticinference AT warnowtandy benchmarkdatasetsandsoftwarefordevelopingandtestingmethodsforlargescalemultiplesequencealignmentandphylogeneticinference |