CroMaSt: a workflow for assessing protein domain classification by cross-mapping of structural instances between domain databases and structural alignment
MOTIVATION: Protein domains can be viewed as building blocks, essential for understanding structure–function relationships in proteins. However, each domain database classifies protein domains using its own methodology. Thus, in many cases, domain models and boundaries differ from one domain databas...
Main authors: | Dhondge, Hrishikesh; Chauvot de Beauchêne, Isaure; Devignes, Marie-Dominique |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10329740/ https://www.ncbi.nlm.nih.gov/pubmed/37431435 http://dx.doi.org/10.1093/bioadv/vbad081 |
_version_ | 1785070083656122368 |
---|---|
author | Dhondge, Hrishikesh Chauvot de Beauchêne, Isaure Devignes, Marie-Dominique |
author_sort | Dhondge, Hrishikesh |
collection | PubMed |
description | MOTIVATION: Protein domains can be viewed as building blocks, essential for understanding structure–function relationships in proteins. However, each domain database classifies protein domains using its own methodology. Thus, in many cases, domain models and boundaries differ from one domain database to the other, raising the question of domain definition and enumeration of true domain instances. RESULTS: We propose an automated iterative workflow to assess protein domain classification by cross-mapping domain structural instances between domain databases and by evaluating structural alignments. CroMaSt (for Cross-Mapper of domain Structural instances) will classify all experimental structural instances of a given domain type into four different categories (‘Core’, ‘True’, ‘Domain-like’ and ‘Failed’). CroMaSt is developed in Common Workflow Language and takes advantage of two well-known domain databases with wide coverage: Pfam and CATH. It uses the Kpax structural alignment tool with expert-adjusted parameters. CroMaSt was tested with the RNA Recognition Motif domain type and identifies 962 ‘True’ and 541 ‘Domain-like’ structural instances for this domain type. This method solves a crucial issue in domain-centric research and can generate essential information that could be used for synthetic biology and machine-learning approaches to protein domain engineering. AVAILABILITY AND IMPLEMENTATION: The workflow and the Results archive for the CroMaSt runs presented in this article are available from WorkflowHub (doi: 10.48546/workflowhub.workflow.390.2). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online. |
format | Online Article Text |
id | pubmed-10329740 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-10329740 2023-07-10 Bioinform Adv Original Paper Oxford University Press 2023-06-27 /pmc/articles/PMC10329740/ /pubmed/37431435 http://dx.doi.org/10.1093/bioadv/vbad081 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | CroMaSt: a workflow for assessing protein domain classification by cross-mapping of structural instances between domain databases and structural alignment |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10329740/ https://www.ncbi.nlm.nih.gov/pubmed/37431435 http://dx.doi.org/10.1093/bioadv/vbad081 |