Efficient protein structure search using indexing methods

Understanding functions of proteins is one of the most important challenges in many studies of biological processes. The function of a protein can be predicted by analyzing the functions of structurally similar proteins, thus finding structurally similar proteins accurately and efficiently from a la...

Bibliographic Details
Main authors: Kim, Sungchul, Sael, Lee, Yu, Hwanjo
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3618241/
https://www.ncbi.nlm.nih.gov/pubmed/23691543
http://dx.doi.org/10.1186/1472-6947-13-S1-S8
author Kim, Sungchul
Sael, Lee
Yu, Hwanjo
collection PubMed
description Understanding the functions of proteins is one of the most important challenges in many studies of biological processes. The function of a protein can be predicted by analyzing the functions of structurally similar proteins; thus, finding structurally similar proteins accurately and efficiently in a large set of proteins is crucial. A protein structure can be represented as a vector by the 3D-Zernike Descriptor (3DZD), which compactly represents the surface shape of the protein tertiary structure. This simplified representation accelerates the search process. However, computing the similarity of two protein structures is still computationally expensive, so it is hard to efficiently process many simultaneous requests for structurally similar protein search. This paper proposes indexing techniques that substantially reduce the time needed to find structurally similar proteins. In particular, we first apply two indexing techniques, iDistance and iKernel, to the 3DZDs. We then extend the techniques to further improve the search speed for protein structures. The extended indexing techniques build and use a reduced index constructed from the first few attributes of the 3DZDs of protein structures. To retrieve the top-k similar structures, the top-10 × k similar structures are first found using the reduced index, and the top-k structures are then selected from among them. We also modify the indexing techniques to support θ-based nearest neighbor search, which returns the data points whose distance to the query point is less than θ. The results show that both iDistance and iKernel significantly improve the search speed. In top-k nearest neighbor search, the search time is reduced by 69.6%, 77%, 77.4%, and 87.9% using iDistance, iKernel, the extended iDistance, and the extended iKernel, respectively. In θ-based nearest neighbor search, the search time is reduced by 80%, 81%, 95.6%, and 95.6% using iDistance, iKernel, the extended iDistance, and the extended iKernel, respectively.
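The filter-and-refine idea in the description (rank by the first few attributes, keep 10 × k candidates, then re-rank with the full descriptor) and the θ-based search can be sketched as follows. This is only an illustrative sketch under stated assumptions: a plain linear scan stands in for the iDistance and iKernel indexes, Euclidean distance is assumed as the similarity measure, and the function names, `prefix_len`, and `factor` are hypothetical rather than taken from the paper.

```python
import heapq
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k_exact(vectors, query, k):
    """Baseline: exact top-k by a full scan over complete descriptors."""
    scored = ((euclidean(v, query), i) for i, v in enumerate(vectors))
    return [i for _, i in heapq.nsmallest(k, scored)]


def top_k_reduced(vectors, query, k, prefix_len=5, factor=10):
    """Approximate top-k: filter on a descriptor prefix, then re-rank.

    Step 1 keeps factor * k candidates ranked by the distance computed
    on the first prefix_len attributes only (the "reduced" view).
    Step 2 re-ranks those candidates using the full descriptors.
    """
    q_prefix = query[:prefix_len]
    coarse = ((euclidean(v[:prefix_len], q_prefix), i)
              for i, v in enumerate(vectors))
    candidates = [i for _, i in heapq.nsmallest(factor * k, coarse)]
    refined = ((euclidean(vectors[i], query), i) for i in candidates)
    return [i for _, i in heapq.nsmallest(k, refined)]


def theta_search(vectors, query, theta, prefix_len=5):
    """Return indices of all vectors whose distance to query is < theta.

    The prefix distance never exceeds the full Euclidean distance, so a
    vector whose prefix distance is already >= theta cannot be a hit and
    is skipped without computing the full distance.
    """
    q_prefix = query[:prefix_len]
    hits = []
    for i, v in enumerate(vectors):
        if euclidean(v[:prefix_len], q_prefix) >= theta:
            continue  # pruned by the reduced view
        if euclidean(v, query) < theta:
            hits.append(i)
    return hits


if __name__ == "__main__":
    rng = random.Random(42)
    dim = 121  # illustrative descriptor length, not taken from this record
    data = [[rng.random() for _ in range(dim)] for _ in range(1000)]
    query = data[0]
    print(top_k_reduced(data, query, k=5))
    print(theta_search(data, query, theta=3.5))
```

Under these assumptions the θ-based search returns exactly the same hits as a full scan, because pruning on the prefix is lossless for Euclidean distance. The top-k variant is approximate: a true neighbor can be missed if it does not rank within the first 10 × k candidates on the prefix alone.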
format Online
Article
Text
id pubmed-3618241
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3618241 2013-04-09 Efficient protein structure search using indexing methods Kim, Sungchul Sael, Lee Yu, Hwanjo BMC Med Inform Decis Mak Proceedings BioMed Central 2013-04-05 /pmc/articles/PMC3618241/ /pubmed/23691543 http://dx.doi.org/10.1186/1472-6947-13-S1-S8 Text en Copyright © 2013 Kim et al.; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Efficient protein structure search using indexing methods
topic Proceedings