PSICalc: a novel approach to identifying and ranking critical non-proximal interdependencies within the overall protein structure

Bibliographic Details
Main authors: Townsley, Thomas D; Wilson, James T; Akers, Harrison; Bryant, Timothy; Cordova, Salvador; Wallace, T L; Durston, Kirk K; Deweese, Joseph E
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9710643/
https://www.ncbi.nlm.nih.gov/pubmed/36699404
http://dx.doi.org/10.1093/bioadv/vbac058
author Townsley, Thomas D
Wilson, James T
Akers, Harrison
Bryant, Timothy
Cordova, Salvador
Wallace, T L
Durston, Kirk K
Deweese, Joseph E
collection PubMed
description MOTIVATION: AlphaFold has been a major advance in predicting protein structure, but still leaves the problem of determining which sub-molecular components of a protein are essential for it to carry out its function within the cell. Direct coupling analysis predicts two- and three-amino acid contacts, but there may be essential interdependencies that are not proximal within the 3D structure. The problem to be addressed is to design a computational method that locates and ranks essential non-proximal interdependencies within a protein involving five or more amino acids, using large, multiple sequence alignments (MSAs) for both globular and intrinsically unstructured proteins. RESULTS: We developed PSICalc (Protein Subdomain Interdependency Calculator), a laptop-friendly, pattern-discovery, bioinformatics software tool that analyzes large MSAs for both structured and unstructured proteins, locates both proximal and non-proximal inter-dependent sites, and clusters them into pairwise (second order), third-order and higher-order clusters using a k-modes approach, and provides ranked results within minutes. To aid in visualizing these interdependencies, we developed a graphical user interface that displays these subdomain relationships as a polytree graph. To demonstrate, we provide examples of both proximal and non-proximal interdependencies documented for eukaryotic topoisomerase II including between the unstructured C-terminal domain and the N-terminal domain. AVAILABILITY AND IMPLEMENTATION: https://github.com/jdeweeselab/psicalc-package SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online.
format Online
Article
Text
id pubmed-9710643
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9710643 2023-01-24 Bioinform Adv Original Paper
Oxford University Press 2022-08-18 /pmc/articles/PMC9710643/ /pubmed/36699404 http://dx.doi.org/10.1093/bioadv/vbac058 Text en © The Author(s) 2022. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title PSICalc: a novel approach to identifying and ranking critical non-proximal interdependencies within the overall protein structure
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9710643/
https://www.ncbi.nlm.nih.gov/pubmed/36699404
http://dx.doi.org/10.1093/bioadv/vbac058