tRNA functional signatures classify plastids as late-branching cyanobacteria

BACKGROUND: Eukaryotes acquired the trait of oxygenic photosynthesis through endosymbiosis of the cyanobacterial progenitor of plastid organelles. Despite recent advances in the phylogenomics of Cyanobacteria, the phylogenetic root of plastids remains controversial. Although a single origin of plast...

Bibliographic Details
Main authors: Lawrence, Travis J; Amrine, Katherine CH; Swingley, Wesley D; Ardell, David H
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6902448/
https://www.ncbi.nlm.nih.gov/pubmed/31818253
http://dx.doi.org/10.1186/s12862-019-1552-7
collection PubMed
description BACKGROUND: Eukaryotes acquired the trait of oxygenic photosynthesis through endosymbiosis of the cyanobacterial progenitor of plastid organelles. Despite recent advances in the phylogenomics of Cyanobacteria, the phylogenetic root of plastids remains controversial. Although a single origin of plastids by endosymbiosis is broadly supported, recent phylogenomic studies are contradictory on whether plastids branch early or late within Cyanobacteria. One underlying cause may be poor fit of evolutionary models to complex phylogenomic data.
RESULTS: Using Posterior Predictive Analysis, we show that recently applied evolutionary models poorly fit three phylogenomic datasets curated from cyanobacteria and plastid genomes because of heterogeneities in both substitution processes across sites and of compositions across lineages. To circumvent these sources of bias, we developed CYANO-MLP, a machine learning algorithm that consistently and accurately phylogenetically classifies (“phyloclassifies”) cyanobacterial genomes to their clade of origin based on bioinformatically predicted function-informative features in tRNA gene complements. Classification of cyanobacterial genomes with CYANO-MLP is accurate and robust to deletion of clades, unbalanced sampling, and compositional heterogeneity in input tRNA data. CYANO-MLP consistently classifies plastid genomes into a late-branching cyanobacterial sub-clade containing single-cell, starch-producing, nitrogen-fixing ecotypes, consistent with metabolic and gene transfer data.
CONCLUSIONS: Phylogenomic data of cyanobacteria and plastids exhibit both site-process heterogeneities and compositional heterogeneities across lineages. These aspects of the data require careful modeling to avoid bias in phylogenomic estimation. Furthermore, we show that amino acid recoding strategies may be insufficient to mitigate bias from compositional heterogeneities. However, the combination of our novel tRNA-specific strategy with machine learning in CYANO-MLP appears robust to these sources of bias with high accuracy in phyloclassification of cyanobacterial genomes. CYANO-MLP consistently classifies plastids as late-branching Cyanobacteria, consistent with independent evidence from signature-based approaches and some previous phylogenetic studies.
id pubmed-6902448
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-6902448 2019-12-11 tRNA functional signatures classify plastids as late-branching cyanobacteria. Lawrence, Travis J; Amrine, Katherine CH; Swingley, Wesley D; Ardell, David H. BMC Evol Biol. Research Article. BioMed Central 2019-12-09. /pmc/articles/PMC6902448/ /pubmed/31818253 http://dx.doi.org/10.1186/s12862-019-1552-7 Text en. © The Author(s) 2019. Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
topic Research Article