Transfer learning identifies sequence determinants of cell-type specific regulatory element accessibility

Dysfunction of regulatory elements through genetic variants is a central mechanism in the pathogenesis of disease. To better understand disease etiology, there is consequently a need to understand how DNA encodes regulatory activity. Deep learning methods show great promise for modeling of biomolecu...

Bibliographic Details
Main authors: Salvatore, Marco; Horlacher, Marc; Marsico, Annalisa; Winther, Ole; Andersson, Robin
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10052367/
https://www.ncbi.nlm.nih.gov/pubmed/37007588
http://dx.doi.org/10.1093/nargab/lqad026
author Salvatore, Marco
Horlacher, Marc
Marsico, Annalisa
Winther, Ole
Andersson, Robin
collection PubMed
description Dysfunction of regulatory elements through genetic variants is a central mechanism in the pathogenesis of disease. To better understand disease etiology, there is consequently a need to understand how DNA encodes regulatory activity. Deep learning methods show great promise for modeling of biomolecular data from DNA sequence but are limited to large input data for training. Here, we develop ChromTransfer, a transfer learning method that uses a pre-trained, cell-type agnostic model of open chromatin regions as a basis for fine-tuning on regulatory sequences. We demonstrate superior performances with ChromTransfer for learning cell-type specific chromatin accessibility from sequence compared to models not informed by a pre-trained model. Importantly, ChromTransfer enables fine-tuning on small input data with minimal decrease in accuracy. We show that ChromTransfer uses sequence features matching binding site sequences of key transcription factors for prediction. Together, these results demonstrate ChromTransfer as a promising tool for learning the regulatory code.
format Online
Article
Text
id pubmed-10052367
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10052367
2023-03-30
Transfer learning identifies sequence determinants of cell-type specific regulatory element accessibility
Salvatore, Marco
Horlacher, Marc
Marsico, Annalisa
Winther, Ole
Andersson, Robin
NAR Genom Bioinform
Standard Article
Dysfunction of regulatory elements through genetic variants is a central mechanism in the pathogenesis of disease. To better understand disease etiology, there is consequently a need to understand how DNA encodes regulatory activity. Deep learning methods show great promise for modeling of biomolecular data from DNA sequence but are limited to large input data for training. Here, we develop ChromTransfer, a transfer learning method that uses a pre-trained, cell-type agnostic model of open chromatin regions as a basis for fine-tuning on regulatory sequences. We demonstrate superior performances with ChromTransfer for learning cell-type specific chromatin accessibility from sequence compared to models not informed by a pre-trained model. Importantly, ChromTransfer enables fine-tuning on small input data with minimal decrease in accuracy. We show that ChromTransfer uses sequence features matching binding site sequences of key transcription factors for prediction. Together, these results demonstrate ChromTransfer as a promising tool for learning the regulatory code.
Oxford University Press
2023-03-29
/pmc/articles/PMC10052367/
/pubmed/37007588
http://dx.doi.org/10.1093/nargab/lqad026
Text
en
© The Author(s) 2023. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.
https://creativecommons.org/licenses/by-nc/4.0/
This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Transfer learning identifies sequence determinants of cell-type specific regulatory element accessibility
topic Standard Article