Understanding structure-guided variant effect predictions using 3D convolutional neural networks

Predicting pathogenicity of missense variants in molecular diagnostics remains a challenge despite the available wealth of data, such as evolutionary information, and the wealth of tools to integrate that data. We describe DeepRank-Mut, a configurable framework designed to extract and learn from phy...

Bibliographic Details
Main authors: Ramakrishnan, Gayatri; Baakman, Coos; Heijl, Stephan; Vroling, Bas; van Horck, Ragna; Hiraki, Jeffrey; Xue, Li C.; Huynen, Martijn A.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10354367/
https://www.ncbi.nlm.nih.gov/pubmed/37475887
http://dx.doi.org/10.3389/fmolb.2023.1204157
author Ramakrishnan, Gayatri
Baakman, Coos
Heijl, Stephan
Vroling, Bas
van Horck, Ragna
Hiraki, Jeffrey
Xue, Li C.
Huynen, Martijn A.
author_sort Ramakrishnan, Gayatri
collection PubMed
description Predicting pathogenicity of missense variants in molecular diagnostics remains a challenge despite the available wealth of data, such as evolutionary information, and the wealth of tools to integrate that data. We describe DeepRank-Mut, a configurable framework designed to extract and learn from physicochemically relevant features of amino acids surrounding missense variants in 3D space. For each variant, various atomic and residue-level features are extracted from its structural environment, including sequence conservation scores of the surrounding amino acids, and stored in multi-channel 3D voxel grids which are then used to train a 3D convolutional neural network (3D-CNN). The resultant model gives a probabilistic estimate of whether a given input variant is disease-causing or benign. We find that the performance of our 3D-CNN model, on independent test datasets, is comparable to other widely used resources which also combine sequence and structural features. Based on the 10-fold cross-validation experiments, we achieve an average accuracy of 0.77 on the independent test datasets. We discuss the contribution of the variant neighborhood to the model’s predictive power, in addition to the impact of individual features on the model’s performance. Two key features, the evolutionary information of residues in the variant neighborhood and their solvent accessibilities, were observed to influence the predictions. We also highlight how predictions are impacted by the underlying disease mechanisms of missense mutations and offer insights into understanding these to improve pathogenicity predictions. Our study presents aspects to take into consideration when adopting deep learning approaches for protein structure-guided pathogenicity predictions.
format Online
Article
Text
id pubmed-10354367
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-10354367 2023-07-20 Front Mol Biosci Molecular Biosciences Frontiers Media S.A. 2023-07-05 /pmc/articles/PMC10354367/ /pubmed/37475887 http://dx.doi.org/10.3389/fmolb.2023.1204157 Text en Copyright © 2023 Ramakrishnan, Baakman, Heijl, Vroling, van Horck, Hiraki, Xue and Huynen. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Understanding structure-guided variant effect predictions using 3D convolutional neural networks
topic Molecular Biosciences
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10354367/
https://www.ncbi.nlm.nih.gov/pubmed/37475887
http://dx.doi.org/10.3389/fmolb.2023.1204157