Uneven Missing Data Skew Phylogenomic Relationships within the Lories and Lorikeets

The resolution of the Tree of Life has accelerated with advances in DNA sequencing technology. To achieve dense taxon sampling, it is often necessary to obtain DNA from historical museum specimens to supplement modern genetic samples. However, DNA from historical material is generally degraded, whic...

Bibliographic Details
Main Authors: Smith, Brian Tilston; Mauck, William M; Benz, Brett W; Andersen, Michael J
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7486955/
https://www.ncbi.nlm.nih.gov/pubmed/32470111
http://dx.doi.org/10.1093/gbe/evaa113
author Smith, Brian Tilston
Mauck, William M
Benz, Brett W
Andersen, Michael J
collection PubMed
description The resolution of the Tree of Life has accelerated with advances in DNA sequencing technology. To achieve dense taxon sampling, it is often necessary to obtain DNA from historical museum specimens to supplement modern genetic samples. However, DNA from historical material is generally degraded, which presents various challenges. In this study, we evaluated how the coverage at variant sites and missing data among historical and modern samples impacts phylogenomic inference. We explored these patterns in the brush-tongued parrots (lories and lorikeets) of Australasia by sampling ultraconserved elements in 105 taxa. Trees estimated with low coverage characters had several clades where relationships appeared to be influenced by whether the sample came from historical or modern specimens, which were not observed when more stringent filtering was applied. To assess if the topologies were affected by missing data, we performed an outlier analysis of sites and loci, and a data reduction approach where we excluded sites based on data completeness. Depending on the outlier test, 0.15% of total sites or 38% of loci were driving the topological differences among trees, and at these sites, historical samples had 10.9× more missing data than modern ones. In contrast, 70% data completeness was necessary to avoid spurious relationships. Predictive modeling found that outlier analysis scores were correlated with parsimony informative sites in the clades whose topologies changed the most by filtering. After accounting for biased loci and understanding the stability of relationships, we inferred a more robust phylogenetic hypothesis for lories and lorikeets.
format Online
Article
Text
id pubmed-7486955
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7486955 2020-09-15 Uneven Missing Data Skew Phylogenomic Relationships within the Lories and Lorikeets Smith, Brian Tilston Mauck, William M Benz, Brett W Andersen, Michael J Genome Biol Evol Research Article
Oxford University Press 2020-05-29 /pmc/articles/PMC7486955/ /pubmed/32470111 http://dx.doi.org/10.1093/gbe/evaa113 Text en © The Author(s) 2020. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Uneven Missing Data Skew Phylogenomic Relationships within the Lories and Lorikeets
topic Research Article