Structure-Aware Annotation of Leucine-rich Repeat Domains
Protein domain annotation is typically done by predictive models such as HMMs trained on sequence motifs. However, sequence-based annotation methods are prone to error, particularly in calling domain boundaries and motifs within them. These methods are limited by a lack of structural information acc...
Main Authors: | Xu, Boyan; Cerbu, Alois; Lim, Daven; Tralie, Christopher J; Krasileva, Ksenia |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10634995/ https://www.ncbi.nlm.nih.gov/pubmed/37961157 http://dx.doi.org/10.1101/2023.10.27.562987 |
_version_ | 1785146272434356224 |
---|---|
author | Xu, Boyan Cerbu, Alois Lim, Daven Tralie, Christopher J Krasileva, Ksenia |
author_facet | Xu, Boyan Cerbu, Alois Lim, Daven Tralie, Christopher J Krasileva, Ksenia |
author_sort | Xu, Boyan |
collection | PubMed |
description | Protein domain annotation is typically done by predictive models such as HMMs trained on sequence motifs. However, sequence-based annotation methods are prone to error, particularly in calling domain boundaries and motifs within them. These methods are limited by a lack of structural information accessible to the model. With the advent of deep learning-based protein structure prediction, existing sequence-based domain annotation methods can be improved by taking into account the geometry of protein structures. We develop dimensionality reduction methods to annotate repeat units of the Leucine Rich Repeat solenoid domain. The methods are able to correct mistakes made by existing machine learning-based annotation tools and enable the automated detection of hairpin loops and structural anomalies in the solenoid. The methods are applied to 127 predicted structures of LRR-containing intracellular innate immune proteins in the model plant Arabidopsis thaliana and validated against a benchmark dataset of 172 manually annotated LRR domains. |
format | Online Article Text |
id | pubmed-10634995 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Cold Spring Harbor Laboratory |
record_format | MEDLINE/PubMed |
spelling | pubmed-10634995 2023-11-13 Structure-Aware Annotation of Leucine-rich Repeat Domains Xu, Boyan Cerbu, Alois Lim, Daven Tralie, Christopher J Krasileva, Ksenia bioRxiv Article Protein domain annotation is typically done by predictive models such as HMMs trained on sequence motifs. However, sequence-based annotation methods are prone to error, particularly in calling domain boundaries and motifs within them. These methods are limited by a lack of structural information accessible to the model. With the advent of deep learning-based protein structure prediction, existing sequence-based domain annotation methods can be improved by taking into account the geometry of protein structures. We develop dimensionality reduction methods to annotate repeat units of the Leucine Rich Repeat solenoid domain. The methods are able to correct mistakes made by existing machine learning-based annotation tools and enable the automated detection of hairpin loops and structural anomalies in the solenoid. The methods are applied to 127 predicted structures of LRR-containing intracellular innate immune proteins in the model plant Arabidopsis thaliana and validated against a benchmark dataset of 172 manually annotated LRR domains. Cold Spring Harbor Laboratory 2023-11-01 /pmc/articles/PMC10634995/ /pubmed/37961157 http://dx.doi.org/10.1101/2023.10.27.562987 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use. |
spellingShingle | Article Xu, Boyan Cerbu, Alois Lim, Daven Tralie, Christopher J Krasileva, Ksenia Structure-Aware Annotation of Leucine-rich Repeat Domains |
title | Structure-Aware Annotation of Leucine-rich Repeat Domains |
title_full | Structure-Aware Annotation of Leucine-rich Repeat Domains |
title_fullStr | Structure-Aware Annotation of Leucine-rich Repeat Domains |
title_full_unstemmed | Structure-Aware Annotation of Leucine-rich Repeat Domains |
title_short | Structure-Aware Annotation of Leucine-rich Repeat Domains |
title_sort | structure-aware annotation of leucine-rich repeat domains |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10634995/ https://www.ncbi.nlm.nih.gov/pubmed/37961157 http://dx.doi.org/10.1101/2023.10.27.562987 |
work_keys_str_mv | AT xuboyan structureawareannotationofleucinerichrepeatdomains AT cerbualois structureawareannotationofleucinerichrepeatdomains AT limdaven structureawareannotationofleucinerichrepeatdomains AT traliechristopherj structureawareannotationofleucinerichrepeatdomains AT krasilevaksenia structureawareannotationofleucinerichrepeatdomains |