Cargando…

Realistic scenarios of missing taxa in phylogenetic comparative methods and their effects on model selection and parameter estimation

Model-based analyses of continuous trait evolution enable rich evolutionary insight. These analyses require a phylogenetic tree and a vector of trait values for the tree’s terminal taxa, but rarely do a tree and dataset include all taxa within a clade. Because the probability that a taxon is include...

Descripción completa

Detalles Bibliográficos
Autor principal: Marcondes, Rafael S.
Formato: Online Artículo Texto
Lenguaje:English
Publicado: PeerJ Inc. 2019
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6791351/
https://www.ncbi.nlm.nih.gov/pubmed/31616606
http://dx.doi.org/10.7717/peerj.7917
Descripción
Sumario:Model-based analyses of continuous trait evolution enable rich evolutionary insight. These analyses require a phylogenetic tree and a vector of trait values for the tree’s terminal taxa, but rarely do a tree and dataset include all taxa within a clade. Because the probability that a taxon is included in a dataset depends on ecological traits that have phylogenetic signal, missing taxa in real datasets should be expected to be phylogenetically clumped or correlated to the modelled trait. I examined whether those types of missing taxa represent a problem for model selection and parameter estimation. I simulated univariate traits under a suite of Brownian Motion and Ornstein-Uhlenbeck models, and assessed the performance of model selection and parameter estimation under absent, random, clumped or correlated missing taxa. I found that those analyses perform well under almost all scenarios, including situations with very sparsely sampled phylogenies. The only notable biases I detected were in parameter estimation under a very high percentage (90%) of correlated missing taxa. My results offer a degree of reassurance for studies of continuous trait evolution with missing taxa, but the problem of missing taxa in phylogenetic comparative methods still demands much further investigation. The framework I have described here might provide a starting point for future work.