
ProtNN: fast and accurate protein 3D-structure classification in structural and topological space

BACKGROUND: Studying the functions and structures of proteins is important for understanding the molecular mechanisms of life. The number of publicly available protein structures has grown extremely large. Still, the classification of a protein structure remains a difficult, costly, an...


Bibliographic Details
Main authors: Dhifli, Wajdi, Diallo, Abdoulaye Baniré
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5034655/
https://www.ncbi.nlm.nih.gov/pubmed/27688811
http://dx.doi.org/10.1186/s13040-016-0108-2
_version_ 1782455316010500096
author Dhifli, Wajdi
Diallo, Abdoulaye Baniré
author_facet Dhifli, Wajdi
Diallo, Abdoulaye Baniré
author_sort Dhifli, Wajdi
collection PubMed
description BACKGROUND: Studying the functions and structures of proteins is important for understanding the molecular mechanisms of life. The number of publicly available protein structures has grown extremely large. Still, classifying a protein structure remains a difficult, costly, and time-consuming task. The difficulties are often due to the essential role of spatial and topological structures in the classification of protein structures. RESULTS: We propose ProtNN, a novel classification approach for protein 3D-structures. Given an unannotated query protein structure and a set of annotated proteins, ProtNN assigns to the query protein the class with the highest number of votes across the k nearest neighbor reference proteins, where k is a user-defined parameter. The search for the nearest annotated structures is based on a protein-graph representation model and on pairwise similarities between vector embeddings of the query and the reference protein structures in structural and topological spaces. CONCLUSIONS: We demonstrate through an extensive experimental evaluation that ProtNN accurately classifies several datasets with an extremely fast runtime compared to state-of-the-art approaches. We further show that ProtNN scales up to a whole PDB dataset in single-process mode with no parallelization, with a runtime gain on the order of thousands of times compared to state-of-the-art approaches.
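The voting scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `embed_graph` and `knn_classify`, the four graph attributes used as the embedding, and the Euclidean distance are all assumptions chosen for illustration, since the record does not state which attributes or similarity measure ProtNN uses.

```python
import math
from collections import Counter


def embed_graph(num_nodes, edges):
    """Map an undirected graph to a small vector of topological attributes.

    Hypothetical embedding: the attributes below (node count, edge count,
    average degree, density) are illustrative assumptions, not the
    attribute set used by ProtNN.
    """
    degrees = [0] * num_nodes
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    num_edges = len(edges)
    avg_degree = sum(degrees) / num_nodes if num_nodes else 0.0
    max_edges = num_nodes * (num_nodes - 1) / 2
    density = num_edges / max_edges if max_edges else 0.0
    return [float(num_nodes), float(num_edges), avg_degree, density]


def knn_classify(query_vec, references, k=3):
    """Return the majority class among the k references nearest to query_vec.

    references is a list of (vector, label) pairs. Euclidean distance is an
    assumption. Ties in the vote go to the label seen first among the
    nearest neighbors, because Counter keeps insertion order.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    ranked = sorted(references, key=lambda ref: math.dist(query_vec, ref[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy usage: reference graphs with made-up labels, then one query graph.
reference_set = [
    (embed_graph(4, [(0, 1), (1, 2), (2, 3)]), "class_a"),
    (embed_graph(5, [(0, 1), (1, 2), (2, 3), (3, 4)]), "class_a"),
    (embed_graph(4, [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]), "class_b"),
]
query = embed_graph(4, [(0, 1), (1, 2), (2, 3), (0, 3)])
predicted = knn_classify(query, reference_set, k=1)
```

Because every reference structure is reduced to a fixed-length vector once, each query costs only one distance computation per reference, which is consistent with the runtime advantage the abstract reports over pairwise structural comparison.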
format Online
Article
Text
id pubmed-5034655
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5034655 2016-09-29 ProtNN: fast and accurate protein 3D-structure classification in structural and topological space Dhifli, Wajdi Diallo, Abdoulaye Baniré BioData Min Research
BioMed Central 2016-09-23 /pmc/articles/PMC5034655/ /pubmed/27688811 http://dx.doi.org/10.1186/s13040-016-0108-2 Text en © The Author(s) 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Dhifli, Wajdi
Diallo, Abdoulaye Baniré
ProtNN: fast and accurate protein 3D-structure classification in structural and topological space
title ProtNN: fast and accurate protein 3D-structure classification in structural and topological space
title_sort protnn: fast and accurate protein 3d-structure classification in structural and topological space
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5034655/
https://www.ncbi.nlm.nih.gov/pubmed/27688811
http://dx.doi.org/10.1186/s13040-016-0108-2
work_keys_str_mv AT dhifliwajdi protnnfastandaccurateprotein3dstructureclassificationinstructuralandtopologicalspace
AT dialloabdoulayebanire protnnfastandaccurateprotein3dstructureclassificationinstructuralandtopologicalspace