
dSPRINT: predicting DNA, RNA, ion, peptide and small molecule interaction sites within protein domains

Domains are instrumental in facilitating protein interactions with DNA, RNA, small molecules, ions and peptides. Identifying ligand-binding domains within sequences is a critical step in protein function annotation, and the ligand-binding properties of proteins are frequently analyzed based upon whe...


Bibliographic Details
Main authors: Etzion-Fuchs, Anat; Todd, David A; Singh, Mona
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8287948/
https://www.ncbi.nlm.nih.gov/pubmed/33999210
http://dx.doi.org/10.1093/nar/gkab356
author Etzion-Fuchs, Anat
Todd, David A
Singh, Mona
collection PubMed
description Domains are instrumental in facilitating protein interactions with DNA, RNA, small molecules, ions and peptides. Identifying ligand-binding domains within sequences is a critical step in protein function annotation, and the ligand-binding properties of proteins are frequently analyzed based upon whether they contain one of these domains. To date, however, knowledge of whether and how protein domains interact with ligands has been limited to domains that have been observed in co-crystal structures; this leaves approximately two-thirds of human protein domain families uncharacterized with respect to whether and how they bind DNA, RNA, small molecules, ions and peptides. To fill this gap, we introduce dSPRINT, a novel ensemble machine learning method for predicting whether a domain binds DNA, RNA, small molecules, ions or peptides, along with the positions within it that participate in these types of interactions. In stringent cross-validation testing, we demonstrate that dSPRINT has an excellent performance in uncovering ligand-binding positions and domains. We also apply dSPRINT to newly characterize the molecular functions of domains of unknown function. dSPRINT’s predictions can be transferred from domains to sequences, enabling predictions about the ligand-binding properties of 95% of human genes. The dSPRINT framework and its predictions for 6503 human protein domains are freely available at http://protdomain.princeton.edu/dsprint.
format Online
Article
Text
id pubmed-8287948
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8287948 2021-07-19 dSPRINT: predicting DNA, RNA, ion, peptide and small molecule interaction sites within protein domains Etzion-Fuchs, Anat Todd, David A Singh, Mona Nucleic Acids Res Methods Online Oxford University Press 2021-05-17 /pmc/articles/PMC8287948/ /pubmed/33999210 http://dx.doi.org/10.1093/nar/gkab356 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of Nucleic Acids Research.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title dSPRINT: predicting DNA, RNA, ion, peptide and small molecule interaction sites within protein domains
topic Methods Online