
Systematic domain-based aggregation of protein structures highlights DNA-, RNA- and other ligand-binding positions

Domains are fundamental subunits of proteins, and while they play major roles in facilitating protein–DNA, protein–RNA and other protein–ligand interactions, a systematic assessment of their various interaction modes is still lacking. A comprehensive resource identifying positions within domains tha...


Bibliographic Details
Main authors: Kobren, Shilpa Nadimpalli; Singh, Mona
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6344845/
https://www.ncbi.nlm.nih.gov/pubmed/30535108
http://dx.doi.org/10.1093/nar/gky1224
author Kobren, Shilpa Nadimpalli
Singh, Mona
collection PubMed
description Domains are fundamental subunits of proteins, and while they play major roles in facilitating protein–DNA, protein–RNA and other protein–ligand interactions, a systematic assessment of their various interaction modes is still lacking. A comprehensive resource identifying positions within domains that tend to interact with nucleic acids, small molecules and other ligands would expand our knowledge of domain functionality as well as aid in detecting ligand-binding sites within structurally uncharacterized proteins. Here, we introduce an approach to identify per-domain-position interaction ‘frequencies’ by aggregating protein co-complex structures by domain and ascertaining how often residues mapping to each domain position interact with ligands. We perform this domain-based analysis on ∼91000 co-complex structures, and infer positions involved in binding DNA, RNA, peptides, ions or small molecules across 4128 domains, which we refer to collectively as the InteracDome. Cross-validation testing reveals that ligand-binding positions for 2152 domains are highly consistent and can be used to identify residues facilitating interactions in ∼63–69% of human genes. Our resource of domain-inferred ligand-binding sites should be a great aid in understanding disease etiology: whereas these sites are enriched in Mendelian-associated and cancer somatic mutations, they are depleted in polymorphisms observed across healthy populations. The InteracDome is available at http://interacdome.princeton.edu.
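The aggregation idea in the description can be illustrated with a minimal sketch. It assumes each observation records whether the residue aligned to a given domain position touches a given ligand type in one structure, and it scores a position by the plain unweighted fraction of observations with a contact. The function name, the tuple layout, the domain label `DOM_A` and the toy data are all hypothetical; the published InteracDome scoring may weight or filter structures differently.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# One observation per domain instance found in a co-complex structure:
# (domain_id, domain_position, ligand_type, residue_contacts_ligand)
Observation = Tuple[str, int, str, bool]
Key = Tuple[str, int, str]


def binding_frequencies(observations: Iterable[Observation]) -> Dict[Key, float]:
    """Return, for each (domain, position, ligand type), the fraction of
    observed domain instances in which that position contacts the ligand.

    Assumption: a simple unweighted fraction, used here only for illustration.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [contact count, total count]
    for domain, position, ligand, contact in observations:
        tally = counts[(domain, position, ligand)]
        tally[0] += int(contact)
        tally[1] += 1
    return {key: hits / total for key, (hits, total) in counts.items()}


# Hypothetical toy data: three structures cover position 3, one covers position 7.
toy = [
    ("DOM_A", 3, "DNA", True),
    ("DOM_A", 3, "DNA", True),
    ("DOM_A", 3, "DNA", False),
    ("DOM_A", 7, "DNA", False),
]
freqs = binding_frequencies(toy)
```

In this toy input, position 3 contacts DNA in two of its three observations and position 7 in none, so position 3 would rank as the more likely DNA-binding position under this simplified scoring.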
format Online
Article
Text
id pubmed-6344845
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6344845 2019-01-29 Nucleic Acids Res Data Resources and Analyses Oxford University Press 2019-01-25 2018-12-07 /pmc/articles/PMC6344845/ /pubmed/30535108 http://dx.doi.org/10.1093/nar/gky1224 Text en © The Author(s) 2018. Published by Oxford University Press on behalf of Nucleic Acids Research.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Systematic domain-based aggregation of protein structures highlights DNA-, RNA- and other ligand-binding positions
topic Data Resources and Analyses