
Asteroid: a new algorithm to infer species trees from gene trees under high proportions of missing data

MOTIVATION: Missing data and incomplete lineage sorting (ILS) are two major obstacles to accurate species tree inference. Gene tree summary methods such as ASTRAL and ASTRID have been developed to account for ILS. However, they can be severely affected by high levels of missing data. RESULTS: We pre...


Bibliographic Details
Main authors: Morel, Benoit; Williams, Tom A; Stamatakis, Alexandros
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9838317/
https://www.ncbi.nlm.nih.gov/pubmed/36576010
http://dx.doi.org/10.1093/bioinformatics/btac832
author Morel, Benoit
Williams, Tom A
Stamatakis, Alexandros
collection PubMed
description MOTIVATION: Missing data and incomplete lineage sorting (ILS) are two major obstacles to accurate species tree inference. Gene tree summary methods such as ASTRAL and ASTRID have been developed to account for ILS. However, they can be severely affected by high levels of missing data. RESULTS: We present Asteroid, a novel algorithm that infers an unrooted species tree from a set of unrooted gene trees. We show on both empirical and simulated datasets that Asteroid is substantially more accurate than ASTRAL and ASTRID for very high proportions (>80%) of missing data. Asteroid is several orders of magnitude faster than ASTRAL for datasets that contain thousands of genes. It offers advanced features such as parallelization, support value computation and support for multi-copy and multifurcating gene trees. AVAILABILITY AND IMPLEMENTATION: Asteroid is freely available at https://github.com/BenoitMorel/Asteroid. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-9838317
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9838317 2023-01-17 Asteroid: a new algorithm to infer species trees from gene trees under high proportions of missing data. Morel, Benoit; Williams, Tom A; Stamatakis, Alexandros. Bioinformatics. Original Paper. Oxford University Press 2022-12-28 /pmc/articles/PMC9838317/ /pubmed/36576010 http://dx.doi.org/10.1093/bioinformatics/btac832 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Asteroid: a new algorithm to infer species trees from gene trees under high proportions of missing data
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9838317/
https://www.ncbi.nlm.nih.gov/pubmed/36576010
http://dx.doi.org/10.1093/bioinformatics/btac832