
TIAMMAt: Leveraging Biodiversity to Revise Protein Domain Models, Evidence from Innate Immunity

Sequence annotation is fundamental for studying the evolution of protein families, particularly when working with nonmodel species. Given the rapid, ever-increasing number of species receiving high-quality genome sequencing, accurate domain modeling that is representative of species diversity is cru...


Bibliographic Details
Main Authors: Tassia, Michael G, David, Kyle T, Townsend, James P, Halanych, Kenneth M
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8662601/
https://www.ncbi.nlm.nih.gov/pubmed/34459919
http://dx.doi.org/10.1093/molbev/msab258
author Tassia, Michael G
David, Kyle T
Townsend, James P
Halanych, Kenneth M
collection PubMed
description Sequence annotation is fundamental for studying the evolution of protein families, particularly when working with nonmodel species. Given the rapid, ever-increasing number of species receiving high-quality genome sequencing, accurate domain modeling that is representative of species diversity is crucial for understanding protein family sequence evolution and their inferred function(s). Here, we describe a bioinformatic tool called Taxon-Informed Adjustment of Markov Model Attributes (TIAMMAt) which revises domain profile hidden Markov models (HMMs) by incorporating homologous domain sequences from underrepresented and nonmodel species. Using innate immunity pathways as a case study, we show that revising profile HMM parameters to directly account for variation in homologs among underrepresented species provides valuable insight into the evolution of protein families. Following adjustment by TIAMMAt, domain profile HMMs exhibit changes in their per-site amino acid state emission probabilities and insertion/deletion probabilities while maintaining the overall structure of the consensus sequence. Our results show that domain revision can heavily impact evolutionary interpretations for some families (i.e., NLR’s NACHT domain), whereas impact on other domains (e.g., rel homology domain and interferon regulatory factor domains) is minimal due to high levels of sequence conservation across the sampled phylogenetic depth (i.e., Metazoa). Importantly, TIAMMAt revises target domain models to reflect homologous sequence variation using the taxonomic distribution under consideration by the user. TIAMMAt’s flexibility to revise any subset of the Pfam database using a user-defined taxonomic pool will make it a valuable tool for future protein evolution studies, particularly when incorporating (or focusing on) nonmodel species.
format Online
Article
Text
id pubmed-8662601
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8662601 2021-12-10 TIAMMAt: Leveraging Biodiversity to Revise Protein Domain Models, Evidence from Innate Immunity Tassia, Michael G David, Kyle T Townsend, James P Halanych, Kenneth M Mol Biol Evol Resources Oxford University Press 2021-08-30 /pmc/articles/PMC8662601/ /pubmed/34459919 http://dx.doi.org/10.1093/molbev/msab258 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title TIAMMAt: Leveraging Biodiversity to Revise Protein Domain Models, Evidence from Innate Immunity
topic Resources