Integrative analysis and prediction of human R-loop binding proteins

In the past decade, there has been a growing appreciation for R-loop structures as important regulators of the epigenome, telomere maintenance, DNA repair, and replication. Given these numerous functions, dozens, or potentially hundreds, of proteins could serve as direct or indirect regulators of R-...

Bibliographic Details
Main Authors: Kumar, Arun, Fournier, Louis-Alexandre, Stirling, Peter C
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9339281/
https://www.ncbi.nlm.nih.gov/pubmed/35666183
http://dx.doi.org/10.1093/g3journal/jkac142
author Kumar, Arun
Fournier, Louis-Alexandre
Stirling, Peter C
collection PubMed
description In the past decade, there has been a growing appreciation for R-loop structures as important regulators of the epigenome, telomere maintenance, DNA repair, and replication. Given these numerous functions, dozens, or potentially hundreds, of proteins could serve as direct or indirect regulators of R-loop writing, reading, and erasing. In order to understand common properties shared amongst potential R-loop binding proteins, we mined published proteomic studies and distilled 10 features that were enriched in R-loop binding proteins compared with the rest of the proteome. Applying an easy-ensemble machine learning approach, we used these R-loop binding protein-specific features along with their amino acid composition to create random forest classifiers that predict the likelihood of a protein to bind to R-loops. Known R-loop regulating pathways such as splicing, DNA damage repair and chromatin remodeling are highly enriched in our datasets, and we validate 2 new R-loop binding proteins LIG1 and FXR1 in human cells. Together these datasets provide a reference to pursue analyses of novel R-loop regulatory proteins.
format Online
Article
Text
id pubmed-9339281
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9339281 2022-08-01 Integrative analysis and prediction of human R-loop binding proteins Kumar, Arun Fournier, Louis-Alexandre Stirling, Peter C G3 (Bethesda) Investigation In the past decade, there has been a growing appreciation for R-loop structures as important regulators of the epigenome, telomere maintenance, DNA repair, and replication. Given these numerous functions, dozens, or potentially hundreds, of proteins could serve as direct or indirect regulators of R-loop writing, reading, and erasing. In order to understand common properties shared amongst potential R-loop binding proteins, we mined published proteomic studies and distilled 10 features that were enriched in R-loop binding proteins compared with the rest of the proteome. Applying an easy-ensemble machine learning approach, we used these R-loop binding protein-specific features along with their amino acid composition to create random forest classifiers that predict the likelihood of a protein to bind to R-loops. Known R-loop regulating pathways such as splicing, DNA damage repair and chromatin remodeling are highly enriched in our datasets, and we validate 2 new R-loop binding proteins LIG1 and FXR1 in human cells. Together these datasets provide a reference to pursue analyses of novel R-loop regulatory proteins. Oxford University Press 2022-06-06 /pmc/articles/PMC9339281/ /pubmed/35666183 http://dx.doi.org/10.1093/g3journal/jkac142 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of Genetics Society of America. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Integrative analysis and prediction of human R-loop binding proteins
topic Investigation