HPMPdb: A machine learning-ready database of protein molecular phenotypes associated to human missense variants
Current databases of human Single Amino acid Variants (SAVs) link SAVs to their effect on the carrier individual's phenotype, often dividing them into Deleterious/Neutral variants. This is a very coarse-grained description of the genotype-to-phenotype relationship because it relies o...
Main authors: | Raimondi, Daniele; Codicè, Francesco; Orlando, Gabriele; Schymkowitz, Joost; Rousseau, Frederic; Moreau, Yves |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier 2022 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9166469/ https://www.ncbi.nlm.nih.gov/pubmed/35669450 http://dx.doi.org/10.1016/j.crstbi.2022.04.004 |
_version_ | 1784720610501328896 |
---|---|
author | Raimondi, Daniele Codicè, Francesco Orlando, Gabriele Schymkowitz, Joost Rousseau, Frederic Moreau, Yves |
author_facet | Raimondi, Daniele Codicè, Francesco Orlando, Gabriele Schymkowitz, Joost Rousseau, Frederic Moreau, Yves |
author_sort | Raimondi, Daniele |
collection | PubMed |
description | Current databases of human Single Amino acid Variants (SAVs) link SAVs to their effect on the carrier individual's phenotype, often dividing them into Deleterious/Neutral variants. This is a very coarse-grained description of the genotype-to-phenotype relationship because it relies on unrealistic assumptions, such as the perfect Mendelian behavior of each SAV, and considers only dichotomic phenotypes. Moreover, the link between the effect of a SAV on a protein (its molecular phenotype) and the individual phenotype is often very complex, because multiple levels of biological abstraction connect the protein-level and individual-level phenotypes. Here we present HPMPdb, a manually curated database containing human SAVs associated with a detailed description of the molecular phenotype they cause on the affected proteins. With particular regard to machine learning (ML), this database lets researchers go beyond the existing Deleterious/Neutral prediction paradigm and build molecular phenotype predictors instead. Our class labels succinctly describe the effects that each SAV has on 15 protein molecular phenotypes, such as protein-protein interaction, small-molecule binding, function, post-translational modifications (PTMs), sub-cellular localization, mimetic PTM, folding and protein expression. Moreover, we provide researchers with all the necessary means to reproducibly train and test their models on our database. The webserver and the data described in this paper are available at hpmp.esat.kuleuven.be. |
format | Online Article Text |
id | pubmed-9166469 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Elsevier |
record_format | MEDLINE/PubMed |
spelling | pubmed-9166469 2022-06-05 HPMPdb: A machine learning-ready database of protein molecular phenotypes associated to human missense variants Raimondi, Daniele Codicè, Francesco Orlando, Gabriele Schymkowitz, Joost Rousseau, Frederic Moreau, Yves Curr Res Struct Biol Research Article Current databases of human Single Amino acid Variants (SAVs) link SAVs to their effect on the carrier individual's phenotype, often dividing them into Deleterious/Neutral variants. This is a very coarse-grained description of the genotype-to-phenotype relationship because it relies on unrealistic assumptions, such as the perfect Mendelian behavior of each SAV, and considers only dichotomic phenotypes. Moreover, the link between the effect of a SAV on a protein (its molecular phenotype) and the individual phenotype is often very complex, because multiple levels of biological abstraction connect the protein-level and individual-level phenotypes. Here we present HPMPdb, a manually curated database containing human SAVs associated with a detailed description of the molecular phenotype they cause on the affected proteins. With particular regard to machine learning (ML), this database lets researchers go beyond the existing Deleterious/Neutral prediction paradigm and build molecular phenotype predictors instead. Our class labels succinctly describe the effects that each SAV has on 15 protein molecular phenotypes, such as protein-protein interaction, small-molecule binding, function, post-translational modifications (PTMs), sub-cellular localization, mimetic PTM, folding and protein expression. Moreover, we provide researchers with all the necessary means to reproducibly train and test their models on our database. The webserver and the data described in this paper are available at hpmp.esat.kuleuven.be. Elsevier 2022-05-13 /pmc/articles/PMC9166469/ /pubmed/35669450 http://dx.doi.org/10.1016/j.crstbi.2022.04.004 Text en © 2022 The Authors https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Research Article Raimondi, Daniele Codicè, Francesco Orlando, Gabriele Schymkowitz, Joost Rousseau, Frederic Moreau, Yves HPMPdb: A machine learning-ready database of protein molecular phenotypes associated to human missense variants |
title | HPMPdb: A machine learning-ready database of protein molecular phenotypes associated to human missense variants |
title_full | HPMPdb: A machine learning-ready database of protein molecular phenotypes associated to human missense variants |
title_fullStr | HPMPdb: A machine learning-ready database of protein molecular phenotypes associated to human missense variants |
title_full_unstemmed | HPMPdb: A machine learning-ready database of protein molecular phenotypes associated to human missense variants |
title_short | HPMPdb: A machine learning-ready database of protein molecular phenotypes associated to human missense variants |
title_sort | hpmpdb: a machine learning-ready database of protein molecular phenotypes associated to human missense variants |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9166469/ https://www.ncbi.nlm.nih.gov/pubmed/35669450 http://dx.doi.org/10.1016/j.crstbi.2022.04.004 |
work_keys_str_mv | AT raimondidaniele hpmpdbamachinelearningreadydatabaseofproteinmolecularphenotypesassociatedtohumanmissensevariants AT codicefrancesco hpmpdbamachinelearningreadydatabaseofproteinmolecularphenotypesassociatedtohumanmissensevariants AT orlandogabriele hpmpdbamachinelearningreadydatabaseofproteinmolecularphenotypesassociatedtohumanmissensevariants AT schymkowitzjoost hpmpdbamachinelearningreadydatabaseofproteinmolecularphenotypesassociatedtohumanmissensevariants AT rousseaufrederic hpmpdbamachinelearningreadydatabaseofproteinmolecularphenotypesassociatedtohumanmissensevariants AT moreauyves hpmpdbamachinelearningreadydatabaseofproteinmolecularphenotypesassociatedtohumanmissensevariants |