
Network-based protein structural classification

Experimental determination of protein function is resource-consuming. As an alternative, computational prediction of protein function has received attention. In this context, protein structural classification (PSC) can help, by allowing for determining structural classes of currently unclassified pr...


Bibliographic Details
Main authors: Newaz, Khalique; Ghalehnovi, Mahboobeh; Rahnama, Arash; Antsaklis, Panos J.; Milenković, Tijana
Format: Online Article Text
Language: English
Published: The Royal Society 2020
Subjects: Computer Science and Artificial Intelligence
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7353965/
https://www.ncbi.nlm.nih.gov/pubmed/32742675
http://dx.doi.org/10.1098/rsos.191461
author Newaz, Khalique
Ghalehnovi, Mahboobeh
Rahnama, Arash
Antsaklis, Panos J.
Milenković, Tijana
author_sort Newaz, Khalique
collection PubMed
description Experimental determination of protein function is resource-consuming. As an alternative, computational prediction of protein function has received attention. In this context, protein structural classification (PSC) can help, by allowing for determining structural classes of currently unclassified proteins based on their features, and then relying on the fact that proteins with similar structures have similar functions. Existing PSC approaches rely on sequence-based or direct three-dimensional (3D) structure-based protein features. By contrast, we first model 3D structures of proteins as protein structure networks (PSNs). Then, we use network-based features for PSC. We propose the use of graphlets, state-of-the-art features in many research areas of network science, in the task of PSC. Moreover, because graphlets can deal only with unweighted PSNs, and because accounting for edge weights when constructing PSNs could improve PSC accuracy, we also propose a deep learning framework that automatically learns network features from weighted PSNs. When evaluated on a large set of approximately 9400 CATH and approximately 12 800 SCOP protein domains (spanning 36 PSN sets), the best of our proposed approaches are superior to existing PSC approaches in terms of accuracy, with comparable running times. Our data and code are available at https://doi.org/10.5281/zenodo.3787922
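The network-based idea in the description above can be illustrated with a minimal Python sketch. It is not the authors' implementation (their data and code are at the Zenodo link given in the description). The single coordinate per residue, the 6 Å distance cutoff, the function names `build_psn` and `count_3node_graphlets`, and the restriction to 3-node graphlets are all assumptions chosen for illustration.

```python
import math
from itertools import combinations


def build_psn(coords, cutoff=6.0):
    """Build an unweighted protein structure network (PSN).

    Nodes are residues; two residues are linked when their representative
    coordinates lie within `cutoff` angstroms (the cutoff is an assumption).
    Returns a dict mapping each node index to the set of its neighbours.
    """
    adj = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if math.dist(coords[i], coords[j]) <= cutoff:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def count_3node_graphlets(adj):
    """Count the two connected 3-node graphlets: induced paths and triangles."""
    triangles_times_three = 0
    open_paths = 0
    for v, nbrs in adj.items():
        for a, b in combinations(sorted(nbrs), 2):
            if b in adj[a]:
                # A closed triple is seen once from each of its three vertices.
                triangles_times_three += 1
            else:
                # An induced 3-node path is seen only from its centre v.
                open_paths += 1
    return {"path3": open_paths, "triangle": triangles_times_three // 3}


# Hypothetical toy coordinates (angstroms), one point per residue.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (3.8, 3.8, 0.0)]
psn = build_psn(coords)
features = count_3node_graphlets(psn)
print(features)
```

In this sketch the counts form a small feature vector per protein that a standard classifier could consume. Larger graphlets and the weighted-PSN deep learning framework mentioned in the description are beyond what this toy example covers.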
format Online
Article
Text
id pubmed-7353965
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher The Royal Society
record_format MEDLINE/PubMed
spelling pubmed-7353965 2020-07-31 Network-based protein structural classification Newaz, Khalique Ghalehnovi, Mahboobeh Rahnama, Arash Antsaklis, Panos J. Milenković, Tijana R Soc Open Sci Computer Science and Artificial Intelligence The Royal Society 2020-06-03 /pmc/articles/PMC7353965/ /pubmed/32742675 http://dx.doi.org/10.1098/rsos.191461 Text en © 2020 The Authors.
Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.
title Network-based protein structural classification
topic Computer Science and Artificial Intelligence
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7353965/
https://www.ncbi.nlm.nih.gov/pubmed/32742675
http://dx.doi.org/10.1098/rsos.191461