CC(+) : A searchable database of validated coiled coils in PDB structures and AlphaFold2 models

Bibliographic Details
Main authors: Kumar, Prasun; Petrenas, Rokas; Dawson, William M.; Schweke, Hugo; Levy, Emmanuel D.; Woolfson, Derek N.
Format: Online Article Text
Language: English
Published: John Wiley & Sons, Inc. 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10588367/
https://www.ncbi.nlm.nih.gov/pubmed/37768271
http://dx.doi.org/10.1002/pro.4789
Description
Summary: α‐Helical coiled coils are common tertiary and quaternary elements of protein structure. In coiled coils, two or more α helices wrap around each other to form bundles. This apparently simple structural motif can generate many architectures and topologies. Coiled coil‐forming sequences can be predicted from heptad repeats of hydrophobic and polar residues, hpphppp, although this is not always reliable. Alternatively, coiled‐coil structures can be identified using the program SOCKET, which finds knobs‐into‐holes (KIH) packing between side chains of neighboring helices. SOCKET also classifies coiled‐coil architecture and topology, thus allowing sequence‐to‐structure relationships to be garnered. In 2009, we used SOCKET to create a relational database of coiled‐coil structures, CC(+), from the RCSB Protein Data Bank (PDB). Here, we report an update of CC(+) following an update of SOCKET (to Socket2) and the recent explosion of structural data and the success of AlphaFold2 in predicting protein structures from genome sequences. With the most‐stringent SOCKET parameters, CC(+) contains ≈12,000 coiled‐coil assemblies from experimentally determined structures, and ≈120,000 potential coiled‐coil structures within single‐chain models predicted by AlphaFold2 across 48 proteomes. CC(+) allows these and other less‐stringently defined coiled coils to be searched at various levels of structure, sequence, and side‐chain interactions. The identified coiled coils can be viewed directly from CC(+) using the Socket2 application, and their associated data can be downloaded for further analyses. CC(+) is available freely at http://coiledcoils.chm.bris.ac.uk/CCPlus/Home.html. It will be updated automatically. We envisage that CC(+) could be used to understand coiled‐coil assemblies and their sequence‐to‐structure relationships, and to aid protein design and engineering.
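
The heptad-repeat idea mentioned in the summary can be illustrated with a minimal sketch. It is not SOCKET, Socket2, or any method from the article: it only checks how well a sequence fits the hpphppp pattern in each of the seven possible registers. The hydrophobic residue set and all function names are assumptions chosen for illustration, and, as the summary notes, sequence-based prediction of this kind is not always reliable.

```python
# Toy heptad-repeat check. NOT the SOCKET/Socket2 algorithm, which works on
# 3D structures; this only compares a sequence with the hpphppp pattern.

HEPTAD = "hpphppp"

# Assumption: a simple hydrophobic set; other classifications are possible.
HYDROPHOBIC = set("AILMFVWY")


def to_hp(seq):
    """Map each residue to 'h' (hydrophobic) or 'p' (treated as polar)."""
    return "".join("h" if aa in HYDROPHOBIC else "p" for aa in seq.upper())


def heptad_match_fraction(seq, register=0):
    """Fraction of positions that agree with hpphppp at the given register."""
    hp = to_hp(seq)
    if not hp:
        return 0.0
    hits = sum(1 for i, c in enumerate(hp) if c == HEPTAD[(i + register) % 7])
    return hits / len(hp)


def best_register(seq):
    """Return (register, fraction) for the best-fitting of the 7 registers.

    Ties are resolved in favour of the lowest register.
    """
    scores = [(heptad_match_fraction(seq, r), r) for r in range(7)]
    frac, reg = max(scores, key=lambda t: (t[0], -t[1]))
    return reg, frac


if __name__ == "__main__":
    # Hypothetical designed-looking repeat, used only as an example input.
    print(best_register("LKKLEEE" * 3))
```

A high match fraction only suggests that a sequence is compatible with a heptad repeat; confirming a coiled coil still requires structural evidence such as the knobs-into-holes packing that SOCKET detects.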