
PDBrenum: A webserver and program providing Protein Data Bank files renumbered according to their UniProt sequences

The Protein Data Bank (PDB) was established at Brookhaven National Laboratory in 1971 as an archive for biological macromolecular crystal structures. As of mid-2021, the database held almost 180,000 structures solved by X-ray crystallography, nuclear magnetic resonance, cryo-electron microscopy, and o...


Bibliographic Details
Main authors: Faezov, Bulat; Dunbrack, Roland L.
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8259974/
https://www.ncbi.nlm.nih.gov/pubmed/34228733
http://dx.doi.org/10.1371/journal.pone.0253411
_version_ 1783718746121043968
author Faezov, Bulat
Dunbrack, Roland L.
collection PubMed
description The Protein Data Bank (PDB) was established at Brookhaven National Laboratory in 1971 as an archive for biological macromolecular crystal structures. As of mid-2021, the database held almost 180,000 structures solved by X-ray crystallography, nuclear magnetic resonance, cryo-electron microscopy, and other methods. Many proteins have been studied under different conditions, including with binding partners such as ligands, nucleic acids, or other proteins, with mutations, and with post-translational modifications, thus enabling extensive comparative structure-function studies. However, these studies are made more difficult because the PDB allows authors to number the amino acids in each protein sequence in any manner they wish. As a result, the same protein is often numbered differently across the available PDB entries. For instance, some authors include N-terminal signal peptides or the N-terminal methionine in the sequence numbering and others do not. In addition to the coordinates, many fields in a PDB entry contain structural and functional information about specific residues, numbered according to the author. Here we provide a webserver and Python 3 application that fixes the PDB sequence numbering problem by replacing the author numbering with numbering derived from the corresponding UniProt sequences. We obtain this correspondence from the SIFTS database at PDBe. The server and program accept a list of PDB entries or a list of UniProt identifiers (e.g., “P04637” or “P53_HUMAN”) and provide renumbered files in mmCIF format and the legacy PDB format, for both asymmetric unit files and biological assembly files provided by PDBe.
format Online
Article
Text
id pubmed-8259974
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8259974 2021-07-19 PDBrenum: A webserver and program providing Protein Data Bank files renumbered according to their UniProt sequences Faezov, Bulat Dunbrack, Roland L. PLoS One Research Article
Public Library of Science 2021-07-06 /pmc/articles/PMC8259974/ /pubmed/34228733 http://dx.doi.org/10.1371/journal.pone.0253411 Text en © 2021 Faezov, Dunbrack https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title PDBrenum: A webserver and program providing Protein Data Bank files renumbered according to their UniProt sequences
topic Research Article
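
The renumbering idea summarized in the description can be illustrated with a minimal Python sketch. It is not the PDBrenum implementation: the function name renumber_pdb_lines and the hand-written mapping are hypothetical, and a real mapping would come from SIFTS rather than being typed in. The sketch assumes legacy PDB-format ATOM/HETATM lines, where the chain ID sits in column 22 and the residue number in columns 23-26, and it ignores insertion codes.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical key type: (chain ID, author residue number).
ResidueKey = Tuple[str, int]


def renumber_pdb_lines(
    lines: Iterable[str], mapping: Dict[ResidueKey, int]
) -> List[str]:
    """Replace author residue numbers with mapped (e.g. UniProt) numbers.

    Lines that are not ATOM/HETATM records, or whose residue is not in
    the mapping, are returned unchanged.
    """
    out: List[str] = []
    for line in lines:
        if line.startswith(("ATOM", "HETATM")) and len(line) >= 26:
            chain = line[21]            # column 22: chain identifier
            field = line[22:26]         # columns 23-26: residue number
            try:
                author_num = int(field)
            except ValueError:
                out.append(line)        # unparsable number: leave as-is
                continue
            new_num = mapping.get((chain, author_num))
            if new_num is not None:
                if not -999 <= new_num <= 9999:
                    raise ValueError("residue number does not fit in 4 columns")
                line = f"{line[:22]}{new_num:>4d}{line[26:]}"
        out.append(line)
    return out


# Toy example: an author numbered the first modeled residue as 1, but the
# construct is assumed to start at UniProt position 24 (hypothetical values).
example = [
    "ATOM      1  N   MET A   1      11.104  13.207   2.100  1.00 20.00           N",
    "ATOM      2  CA  MET A   1      12.560  13.100   2.300  1.00 20.00           C",
    "TER",
]
toy_mapping = {("A", 1): 24}
renumbered = renumber_pdb_lines(example, toy_mapping)
```

Only the four residue-number columns are rewritten here, so every other column keeps its position. As the description notes, many other fields in a real entry also carry author numbering, and this sketch does not touch them.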