Discovering functionally important sites in proteins

Bibliographic Details
Main Authors: Cagiada, Matteo; Bottaro, Sandro; Lindemose, Søren; Schenstrøm, Signe M.; Stein, Amelie; Hartmann-Petersen, Rasmus; Lindorff-Larsen, Kresten
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10345196/
https://www.ncbi.nlm.nih.gov/pubmed/37443362
http://dx.doi.org/10.1038/s41467-023-39909-0
author Cagiada, Matteo
Bottaro, Sandro
Lindemose, Søren
Schenstrøm, Signe M.
Stein, Amelie
Hartmann-Petersen, Rasmus
Lindorff-Larsen, Kresten
collection PubMed
description Proteins play important roles in biology, biotechnology and pharmacology, and missense variants are a common cause of disease. Discovering functionally important sites in proteins is a central but difficult problem because of the lack of large, systematic data sets. Sequence conservation can highlight residues that are functionally important but is often convoluted with a signal for preserving structural stability. We here present a machine learning method to predict functional sites by combining statistical models for protein sequences with biophysical models of stability. We train the model using multiplexed experimental data on variant effects and validate it broadly. We show how the model can be used to discover active sites, as well as regulatory and binding sites. We illustrate the utility of the model by prospective prediction and subsequent experimental validation on the functional consequences of missense variants in HPRT1 which may cause Lesch-Nyhan syndrome, and pinpoint the molecular mechanisms by which they cause disease.
format Online
Article
Text
id pubmed-10345196
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10345196 2023-07-15
Discovering functionally important sites in proteins
Cagiada, Matteo; Bottaro, Sandro; Lindemose, Søren; Schenstrøm, Signe M.; Stein, Amelie; Hartmann-Petersen, Rasmus; Lindorff-Larsen, Kresten
Nat Commun
Article
Nature Publishing Group UK 2023-07-13
/pmc/articles/PMC10345196/
/pubmed/37443362
http://dx.doi.org/10.1038/s41467-023-39909-0
Text en
© The Author(s) 2023
https://creativecommons.org/licenses/by/4.0/
Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title Discovering functionally important sites in proteins
topic Article