
Real-time structural motif searching in proteins using an inverted index strategy

Biochemical and biological functions of proteins are the product of both the overall fold of the polypeptide chain, and, typically, structural motifs made up of smaller numbers of amino acids constituting a catalytic center or a binding site that may be remote from one another in amino acid sequence...


Bibliographic Details
Main authors: Bittrich, Sebastian; Burley, Stephen K.; Rose, Alexander S.
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7746303/
https://www.ncbi.nlm.nih.gov/pubmed/33284792
http://dx.doi.org/10.1371/journal.pcbi.1008502
_version_ 1783624769479901184
author Bittrich, Sebastian
Burley, Stephen K.
Rose, Alexander S.
author_facet Bittrich, Sebastian
Burley, Stephen K.
Rose, Alexander S.
author_sort Bittrich, Sebastian
collection PubMed
description Biochemical and biological functions of proteins are the product of both the overall fold of the polypeptide chain, and, typically, structural motifs made up of smaller numbers of amino acids constituting a catalytic center or a binding site that may be remote from one another in amino acid sequence. Detection of such structural motifs can provide valuable insights into the function(s) of previously uncharacterized proteins. Technically, this remains an extremely challenging problem because of the size of the Protein Data Bank (PDB) archive. Existing methods depend on a clustering by sequence similarity and can be computationally slow. We have developed a new approach that uses an inverted index strategy capable of analyzing >170,000 PDB structures with unmatched speed. The efficiency of the inverted index method depends critically on identifying the small number of structures containing the query motif and ignoring most of the structures that are irrelevant. Our approach (implemented at motif.rcsb.org) enables real-time retrieval and superposition of structural motifs, either extracted from a reference structure or uploaded by the user. Herein, we describe the method and present five case studies that exemplify its efficacy and speed for analyzing 3D structures of both proteins and nucleic acids.
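The inverted index idea mentioned in the description can be illustrated with a minimal sketch. This is not the authors' implementation: the descriptor (a pair of residue types plus a binned distance), the function names `descriptors`, `build_index`, and `candidates`, the 1 Å bin width, and the toy coordinates are all assumptions made for illustration. The point it shows is that a lookup touches only the structures listed under the query's descriptors and never scans the rest of the archive.

```python
from collections import defaultdict
from itertools import combinations
from math import dist

def descriptors(residues, bin_width=1.0):
    """Yield (type_a, type_b, distance_bin) for every residue pair.

    Hypothetical descriptor: residue types in sorted order plus the
    pair distance floored into bins of `bin_width` angstroms.
    """
    for (t1, p1), (t2, p2) in combinations(residues, 2):
        a, b = sorted((t1, t2))
        yield (a, b, int(dist(p1, p2) // bin_width))

def build_index(structures):
    """Map each descriptor to the set of structure ids containing it."""
    index = defaultdict(set)
    for structure_id, residues in structures.items():
        for d in descriptors(residues):
            index[d].add(structure_id)
    return index

def candidates(index, motif):
    """Return ids of structures that contain every descriptor of the motif."""
    postings = [index.get(d, set()) for d in descriptors(motif)]
    return set.intersection(*postings) if postings else set()

# Toy data (invented): residue type and one representative 3D coordinate.
structures = {
    "toy1": [("HIS", (0, 0, 0)), ("ASP", (3.2, 0, 0)), ("SER", (0, 4.5, 0))],
    "toy2": [("HIS", (0, 0, 0)), ("ASP", (9.0, 0, 0)), ("SER", (0, 4.5, 0))],
    "toy3": [("GLY", (0, 0, 0)), ("ALA", (3.2, 0, 0))],
}
motif = [("HIS", (1, 1, 1)), ("ASP", (4.3, 1, 1)), ("SER", (1, 5.4, 1))]

index = build_index(structures)
hits = candidates(index, motif)
```

In this sketch the hits are only candidates. A practical search would also need a distance tolerance across neighbouring bins and a geometric verification step (for example, superposition of the matched residues) before reporting a match.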
format Online
Article
Text
id pubmed-7746303
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7746303 2020-12-31 Real-time structural motif searching in proteins using an inverted index strategy Bittrich, Sebastian Burley, Stephen K. Rose, Alexander S. PLoS Comput Biol Research Article
Public Library of Science 2020-12-07 /pmc/articles/PMC7746303/ /pubmed/33284792 http://dx.doi.org/10.1371/journal.pcbi.1008502 Text en © 2020 Bittrich et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Bittrich, Sebastian
Burley, Stephen K.
Rose, Alexander S.
Real-time structural motif searching in proteins using an inverted index strategy
title Real-time structural motif searching in proteins using an inverted index strategy
title_full Real-time structural motif searching in proteins using an inverted index strategy
title_fullStr Real-time structural motif searching in proteins using an inverted index strategy
title_full_unstemmed Real-time structural motif searching in proteins using an inverted index strategy
title_short Real-time structural motif searching in proteins using an inverted index strategy
title_sort real-time structural motif searching in proteins using an inverted index strategy
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7746303/
https://www.ncbi.nlm.nih.gov/pubmed/33284792
http://dx.doi.org/10.1371/journal.pcbi.1008502
work_keys_str_mv AT bittrichsebastian realtimestructuralmotifsearchinginproteinsusinganinvertedindexstrategy
AT burleystephenk realtimestructuralmotifsearchinginproteinsusinganinvertedindexstrategy
AT rosealexanders realtimestructuralmotifsearchinginproteinsusinganinvertedindexstrategy