
Boost-RS: boosted embeddings for recommender systems and its application to enzyme–substrate interaction prediction

MOTIVATION: Despite experimental and curation efforts, the extent of enzyme promiscuity on substrates continues to be largely unexplored and under-documented. Providing computational tools for the exploration of the enzyme–substrate interaction space can expedite experimentation and benefit applicat...


Bibliographic Details
Main Authors:	Li, Xinmeng; Liu, Li-Ping; Hassoun, Soha
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2022
Subjects:	Original Papers
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9113267/ https://www.ncbi.nlm.nih.gov/pubmed/35561204 http://dx.doi.org/10.1093/bioinformatics/btac201

_version_	1784709554369462272
author	Li, Xinmeng; Liu, Li-Ping; Hassoun, Soha
collection	PubMed
description	MOTIVATION: Despite experimental and curation efforts, the extent of enzyme promiscuity on substrates continues to be largely unexplored and under-documented. Providing computational tools for the exploration of the enzyme–substrate interaction space can expedite experimentation and benefit applications such as constructing synthesis pathways for novel biomolecules, identifying products of metabolism on ingested compounds, and elucidating xenobiotic metabolism. Recommender systems (RS), which are currently unexplored for the enzyme–substrate interaction prediction problem, can be utilized to provide enzyme recommendations for substrates, and vice versa. The performance of Collaborative-Filtering (CF) RSs, however, hinges on the quality of embedding vectors of users and items (enzymes and substrates in our case). Importantly, enhancing CF embeddings with heterogeneous auxiliary data, especially relational data (e.g. hierarchical, pairwise or groupings), remains a challenge. RESULTS: We propose an innovative general RS framework, termed Boost-RS, that enhances RS performance by ‘boosting’ embedding vectors through auxiliary data. Specifically, Boost-RS is trained and dynamically tuned on multiple relevant auxiliary learning tasks. Boost-RS utilizes contrastive learning tasks to exploit relational data. To show the efficacy of Boost-RS for the enzyme–substrate interaction prediction problem, we apply the Boost-RS framework to several baseline CF models. We show that each of our auxiliary tasks boosts learning of the embedding vectors, and that contrastive learning using Boost-RS outperforms attribute concatenation and multi-label learning. We also show that Boost-RS outperforms similarity-based models. Ablation studies and visualization of learned representations highlight the importance of using contrastive learning on some of the auxiliary data in boosting the embedding vectors.
AVAILABILITY AND IMPLEMENTATION: A Python implementation for Boost-RS is provided at https://github.com/HassounLab/Boost-RS. The enzyme–substrate interaction data is available from the KEGG database (https://www.genome.jp/kegg/).
format	Online Article Text
id	pubmed-9113267
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-9113267 2022-05-18 Boost-RS: boosted embeddings for recommender systems and its application to enzyme–substrate interaction prediction Li, Xinmeng Liu, Li-Ping Hassoun, Soha Bioinformatics Original Papers Oxford University Press 2022-04-12 /pmc/articles/PMC9113267/ /pubmed/35561204 http://dx.doi.org/10.1093/bioinformatics/btac201 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title	Boost-RS: boosted embeddings for recommender systems and its application to enzyme–substrate interaction prediction
topic	Original Papers
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9113267/ https://www.ncbi.nlm.nih.gov/pubmed/35561204 http://dx.doi.org/10.1093/bioinformatics/btac201
