TT3D: Leveraging precomputed protein 3D sequence models to predict protein–protein interactions
MOTIVATION: High-quality computational structural models are now precomputed and available for nearly every protein in UniProt. However, the best way to leverage these models to predict which pairs of proteins interact in a high-throughput manner is not immediately clear. The recent Foldseek method...
Main authors: | Sledzieski, Samuel; Devkota, Kapil; Singh, Rohit; Cowen, Lenore; Berger, Bonnie |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2023 |
Subjects: | Applications Note |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10640393/ https://www.ncbi.nlm.nih.gov/pubmed/37897686 http://dx.doi.org/10.1093/bioinformatics/btad663 |
_version_ | 1785133745762729984 |
---|---|
author | Sledzieski, Samuel Devkota, Kapil Singh, Rohit Cowen, Lenore Berger, Bonnie |
author_facet | Sledzieski, Samuel Devkota, Kapil Singh, Rohit Cowen, Lenore Berger, Bonnie |
author_sort | Sledzieski, Samuel |
collection | PubMed |
description | MOTIVATION: High-quality computational structural models are now precomputed and available for nearly every protein in UniProt. However, the best way to leverage these models to predict which pairs of proteins interact in a high-throughput manner is not immediately clear. The recent Foldseek method of van Kempen et al. encodes the structural information of distances and angles along the protein backbone into a linear string of the same length as the protein string, using tokens from a 21-letter discretized structural alphabet (3Di). RESULTS: We show that using both the amino acid sequence and the 3Di sequence generated by Foldseek as inputs to our recent deep-learning method, Topsy-Turvy, substantially improves the performance of predicting protein–protein interactions cross-species. Thus TT3D (Topsy-Turvy 3D) presents a way to reuse all the computational effort going into producing high-quality structural models from sequence, while being sufficiently lightweight so that high-quality binary protein–protein interaction predictions across all protein pairs can be made genome-wide. AVAILABILITY AND IMPLEMENTATION: TT3D is available at https://github.com/samsledje/D-SCRIPT. An archived version of the code at time of submission can be found at https://zenodo.org/records/10037674. |
format | Online Article Text |
id | pubmed-10640393 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-10640393 2023-10-28 TT3D: Leveraging precomputed protein 3D sequence models to predict protein–protein interactions Sledzieski, Samuel Devkota, Kapil Singh, Rohit Cowen, Lenore Berger, Bonnie Bioinformatics Applications Note MOTIVATION: High-quality computational structural models are now precomputed and available for nearly every protein in UniProt. However, the best way to leverage these models to predict which pairs of proteins interact in a high-throughput manner is not immediately clear. The recent Foldseek method of van Kempen et al. encodes the structural information of distances and angles along the protein backbone into a linear string of the same length as the protein string, using tokens from a 21-letter discretized structural alphabet (3Di). RESULTS: We show that using both the amino acid sequence and the 3Di sequence generated by Foldseek as inputs to our recent deep-learning method, Topsy-Turvy, substantially improves the performance of predicting protein–protein interactions cross-species. Thus TT3D (Topsy-Turvy 3D) presents a way to reuse all the computational effort going into producing high-quality structural models from sequence, while being sufficiently lightweight so that high-quality binary protein–protein interaction predictions across all protein pairs can be made genome-wide. AVAILABILITY AND IMPLEMENTATION: TT3D is available at https://github.com/samsledje/D-SCRIPT. An archived version of the code at time of submission can be found at https://zenodo.org/records/10037674. Oxford University Press 2023-10-28 /pmc/articles/PMC10640393/ /pubmed/37897686 http://dx.doi.org/10.1093/bioinformatics/btad663 Text en © The Author(s) 2023. Published by Oxford University Press.
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Applications Note Sledzieski, Samuel Devkota, Kapil Singh, Rohit Cowen, Lenore Berger, Bonnie TT3D: Leveraging precomputed protein 3D sequence models to predict protein–protein interactions |
title | TT3D: Leveraging precomputed protein 3D sequence models to predict protein–protein interactions |
title_full | TT3D: Leveraging precomputed protein 3D sequence models to predict protein–protein interactions |
title_fullStr | TT3D: Leveraging precomputed protein 3D sequence models to predict protein–protein interactions |
title_full_unstemmed | TT3D: Leveraging precomputed protein 3D sequence models to predict protein–protein interactions |
title_short | TT3D: Leveraging precomputed protein 3D sequence models to predict protein–protein interactions |
title_sort | tt3d: leveraging precomputed protein 3d sequence models to predict protein–protein interactions |
topic | Applications Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10640393/ https://www.ncbi.nlm.nih.gov/pubmed/37897686 http://dx.doi.org/10.1093/bioinformatics/btad663 |
work_keys_str_mv | AT sledzieskisamuel tt3dleveragingprecomputedprotein3dsequencemodelstopredictproteinproteininteractions AT devkotakapil tt3dleveragingprecomputedprotein3dsequencemodelstopredictproteinproteininteractions AT singhrohit tt3dleveragingprecomputedprotein3dsequencemodelstopredictproteinproteininteractions AT cowenlenore tt3dleveragingprecomputedprotein3dsequencemodelstopredictproteinproteininteractions AT bergerbonnie tt3dleveragingprecomputedprotein3dsequencemodelstopredictproteinproteininteractions |