MADOKA: an ultra-fast approach for large-scale protein structure similarity searching

BACKGROUND: Protein comparative analysis and similarity searches play essential roles in structural bioinformatics. A couple of algorithms for protein structure alignments have been developed in recent years. However, facing the rapid growth of protein structure data, improving overall comparison pe...

Bibliographic Details
Main authors: Deng, Lei; Zhong, Guolun; Liu, Chenzhe; Luo, Judong; Liu, Hui
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6929402/
https://www.ncbi.nlm.nih.gov/pubmed/31870277
http://dx.doi.org/10.1186/s12859-019-3235-1
_version_ 1783482691788734464
author Deng, Lei
Zhong, Guolun
Liu, Chenzhe
Luo, Judong
Liu, Hui
author_facet Deng, Lei
Zhong, Guolun
Liu, Chenzhe
Luo, Judong
Liu, Hui
author_sort Deng, Lei
collection PubMed
description BACKGROUND: Protein comparative analysis and similarity searches play essential roles in structural bioinformatics. Several algorithms for protein structure alignment have been developed in recent years. However, given the rapid growth of protein structure data, improving overall comparison performance and running efficiency with massive sequences is still challenging. RESULTS: Here, we propose MADOKA, an ultra-fast approach for massive structural neighbor searching using a novel two-phase algorithm. First, we apply a fast alignment to each pair of structures. Then, we use a score to select the more similar pairs for a more accurate fragment-based residue-level alignment. MADOKA performs about 6–100 times faster than existing methods, including TM-align and SAL, in massive alignments. Moreover, the structural alignment quality of MADOKA is better than that of the existing algorithms in terms of TM-score and number of aligned residues. We also developed a web server for searching structural neighbors in the PDB database (about 360,000 protein chains in total), with additional features such as 3D structure alignment visualization. The MADOKA web server is freely available at: http://madoka.denglab.org/ CONCLUSIONS: MADOKA is an efficient approach to searching for protein structure similarity. In addition, we provide a parallel implementation of MADOKA that exploits the power of multi-core CPUs.
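The two-phase idea in the abstract (a cheap first pass, a score-based filter, then a more expensive alignment on the surviving pairs) can be illustrated with a generic filter-then-refine loop. This is a minimal sketch and not MADOKA's implementation: the function `two_phase_search`, the `threshold` parameter, and both toy scores are hypothetical stand-ins. Here structures are represented as secondary-structure strings, the coarse score is a length ratio, and the fine score is a standard-library string similarity, none of which come from the paper.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, List, Tuple


def two_phase_search(
    query: str,
    database: Dict[str, str],
    coarse_score: Callable[[str, str], float],
    fine_score: Callable[[str, str], float],
    threshold: float,
) -> List[Tuple[str, float]]:
    """Filter candidates with a cheap score, then rank survivors with a costly one."""
    hits = []
    for name, candidate in database.items():
        # Phase 1: cheap comparison; discard pairs scoring below the threshold.
        if coarse_score(query, candidate) < threshold:
            continue
        # Phase 2: expensive comparison, run only on the selected pairs.
        hits.append((name, fine_score(query, candidate)))
    # Report the most similar neighbors first.
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits


def length_ratio(a: str, b: str) -> float:
    """Toy coarse score (assumption): ratio of the shorter to the longer length."""
    if not a or not b:
        return 0.0
    return min(len(a), len(b)) / max(len(a), len(b))


def string_similarity(a: str, b: str) -> float:
    """Toy fine score (assumption): difflib similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


if __name__ == "__main__":
    # Hypothetical database of secondary-structure strings (H = helix, E = strand, C = coil).
    toy_db = {"a": "HHHHEEEE", "b": "HHHHEEEC", "c": "HH"}
    for name, score in two_phase_search(
        "HHHHEEEE", toy_db, length_ratio, string_similarity, threshold=0.5
    ):
        print(name, round(score, 3))
```

The benefit of this layout is that the expensive score runs only on pairs that pass the cheap filter, so total cost depends on how selective the threshold is. In a real structural search both scores would operate on 3D coordinates rather than strings.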
format Online
Article
Text
id pubmed-6929402
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-69294022019-12-30 MADOKA: an ultra-fast approach for large-scale protein structure similarity searching Deng, Lei Zhong, Guolun Liu, Chenzhe Luo, Judong Liu, Hui BMC Bioinformatics Methodology BACKGROUND: Protein comparative analysis and similarity searches play essential roles in structural bioinformatics. A couple of algorithms for protein structure alignments have been developed in recent years. However, facing the rapid growth of protein structure data, improving overall comparison performance and running efficiency with massive sequences is still challenging. RESULTS: Here, we propose MADOKA, an ultra-fast approach for massive structural neighbor searching using a novel two-phase algorithm. Initially, we apply a fast alignment between pairwise structures. Then, we employ a score to select pairs with more similarity to carry out a more accurate fragment-based residue-level alignment. MADOKA performs about 6–100 times faster than existing methods, including TM-align and SAL, in massive alignments. Moreover, the quality of structural alignment of MADOKA is better than the existing algorithms in terms of TM-score and number of aligned residues. We also develop a web server to search structural neighbors in PDB database (About 360,000 protein chains in total), as well as additional features such as 3D structure alignment visualization. The MADOKA web server is freely available at: http://madoka.denglab.org/ CONCLUSIONS: MADOKA is an efficient approach to search for protein structure similarity. In addition, we provide a parallel implementation of MADOKA which exploits massive power of multi-core CPUs. 
BioMed Central 2019-12-24 /pmc/articles/PMC6929402/ /pubmed/31870277 http://dx.doi.org/10.1186/s12859-019-3235-1 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Methodology
Deng, Lei
Zhong, Guolun
Liu, Chenzhe
Luo, Judong
Liu, Hui
MADOKA: an ultra-fast approach for large-scale protein structure similarity searching
title MADOKA: an ultra-fast approach for large-scale protein structure similarity searching
title_sort madoka: an ultra-fast approach for large-scale protein structure similarity searching
topic Methodology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6929402/
https://www.ncbi.nlm.nih.gov/pubmed/31870277
http://dx.doi.org/10.1186/s12859-019-3235-1