Cargando…

Disk covering methods improve phylogenomic analyses

MOTIVATION: With the rapid growth rate of newly sequenced genomes, species tree inference from multiple genes has become a basic bioinformatics task in comparative and evolutionary biology. However, accurate species tree estimation is difficult in the presence of gene tree discordance, which is ofte...

Descripción completa

Detalles Bibliográficos
Autores principales: Bayzid, Md Shamsuzzoha, Hunt, Tyler, Warnow, Tandy
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2014
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4239662/
https://www.ncbi.nlm.nih.gov/pubmed/25572610
http://dx.doi.org/10.1186/1471-2164-15-S6-S7
Descripción
Sumario:MOTIVATION: With the rapid growth rate of newly sequenced genomes, species tree inference from multiple genes has become a basic bioinformatics task in comparative and evolutionary biology. However, accurate species tree estimation is difficult in the presence of gene tree discordance, which is often due to incomplete lineage sorting (ILS), modelled by the multi-species coalescent. Several highly accurate coalescent-based species tree estimation methods have been developed over the last decade, including MP-EST. However, the running time for MP-EST increases rapidly as the number of species grows. RESULTS: We present divide-and-conquer techniques that improve the scalability of MP-EST so that it can run efficiently on large datasets. Surprisingly, this technique also improves the accuracy of species trees estimated by MP-EST, as our study shows on a collection of simulated and biological datasets.