Computing evolutionary distinctiveness indices in large scale analysis

We present optimal linear time algorithms for computing the Shapley values and 'heightened evolutionary distinctiveness' (HED) scores for the set of taxa in a phylogenetic tree. We demonstrate the efficiency of these new algorithms by applying them to a set of 10,000 reasonable 5139-specie...

Bibliographic Details
Main authors: Martyn, Iain; Kuhn, Tyler S; Mooers, Arne O; Moulton, Vincent; Spillner, Andreas
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3353162/
https://www.ncbi.nlm.nih.gov/pubmed/22502588
http://dx.doi.org/10.1186/1748-7188-7-6
_version_ 1782233001263890432
author Martyn, Iain
Kuhn, Tyler S
Mooers, Arne O
Moulton, Vincent
Spillner, Andreas
collection PubMed
description We present optimal linear time algorithms for computing the Shapley values and 'heightened evolutionary distinctiveness' (HED) scores for the set of taxa in a phylogenetic tree. We demonstrate the efficiency of these new algorithms by applying them to a set of 10,000 reasonable 5139-species mammal trees. This is the first time these indices have been computed on such a large taxon, and we contrast our findings with an ad-hoc index for mammals, fair proportion (FP), used by the Zoological Society of London's EDGE programme. Our empirical results follow expectations. In particular, the Shapley values are very strongly correlated with the FP scores, but give a higher weight to the few monotremes that comprise the sister group to all other mammals. We also find that the HED score, which measures a species' unique contribution to future subsets as a function of the probability that close relatives will go extinct, is very sensitive to the estimated probabilities. When these probabilities are low, HED scores are lower than FP scores and approach the simple measure of a species' age. Deviations (like the Solenodon genus of the West Indies) occur when sister species are both at high risk of extinction and their clade roots deep in the tree. Conversely, when endangered species have higher probabilities of being lost, HED scores can be greater than FP scores, and species like the African elephant Loxodonta africana, the two solenodons and the thumbless bat Furipterus horrens can move up the rankings. We suggest that conservation attention be applied to such species that carry genetic responsibility for imperiled close relatives. We also briefly discuss extensions of Shapley values and HED scores that are possible with the algorithms presented here.
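To make the indices named in the description concrete, here is a minimal Python sketch on a toy tree. It is not the linear-time method the authors present; it is a simple quadratic walk from each leaf to the root. It assumes the usual definition of fair proportion (each branch length is shared equally among the leaves below it) and a rooted-tree reading of HED in which a branch counts for a species only if every other species below that branch goes extinct, with independent extinction probabilities. The tree, the names `TREE`, `fair_proportion`, `hed`, and the probabilities are all hypothetical and do not come from this record.

```python
from math import prod

# Hypothetical toy tree: child -> (parent, branch length). "root" has no entry.
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "n1": ("root", 2.0),
    "C": ("root", 3.0),
}

def leaves(tree):
    """Nodes that never appear as a parent are the taxa (leaves)."""
    parents = {parent for parent, _ in tree.values()}
    return [node for node in tree if node not in parents]

def path_to_root(tree, leaf):
    """Yield each node whose incoming branch lies on the leaf-to-root path."""
    node = leaf
    while node in tree:
        yield node
        node = tree[node][0]

def leaves_below(tree):
    """Map each non-root node to the set of leaves in its subtree."""
    below = {node: set() for node in tree}
    for leaf in leaves(tree):
        for node in path_to_root(tree, leaf):
            below[node].add(leaf)
    return below

def fair_proportion(tree):
    """FP: sum of branch length / number of descendant leaves along the path."""
    below = leaves_below(tree)
    return {
        leaf: sum(tree[n][1] / len(below[n]) for n in path_to_root(tree, leaf))
        for leaf in leaves(tree)
    }

def hed(tree, p_ext):
    """Assumed rooted HED: a branch counts only if all other leaves below it are lost."""
    below = leaves_below(tree)
    return {
        leaf: sum(
            tree[n][1] * prod(p_ext[other] for other in below[n] - {leaf})
            for n in path_to_root(tree, leaf)
        )
        for leaf in leaves(tree)
    }

if __name__ == "__main__":
    print(fair_proportion(TREE))
    # Extinction probabilities of zero reduce HED to the pendant branch length.
    print(hed(TREE, {"A": 0.0, "B": 0.0, "C": 0.0}))
```

In this sketch the FP scores sum to the total branch length of the tree, and setting every extinction probability to zero leaves each species with only its own pendant branch, which matches the description's remark that HED approaches a species' age when the probabilities are low.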
format Online
Article
Text
id pubmed-3353162
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3353162 2012-05-16 Computing evolutionary distinctiveness indices in large scale analysis Martyn, Iain Kuhn, Tyler S Mooers, Arne O Moulton, Vincent Spillner, Andreas Algorithms Mol Biol Research
BioMed Central 2012-04-13 /pmc/articles/PMC3353162/ /pubmed/22502588 http://dx.doi.org/10.1186/1748-7188-7-6 Text en Copyright ©2012 Martyn et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Computing evolutionary distinctiveness indices in large scale analysis
topic Research