Domain-adaptive neural networks improve cross-species prediction of transcription factor binding

The intrinsic DNA sequence preferences and cell type–specific cooperative partners of transcription factors (TFs) are typically highly conserved. Hence, despite the rapid evolutionary turnover of individual TF binding sites, predictive sequence models of cell type–specific genomic occupancy of a TF...

Bibliographic Details
Main authors: Cochran, Kelly; Srivastava, Divyanshi; Shrikumar, Avanti; Balsubramani, Akshay; Hardison, Ross C.; Kundaje, Anshul; Mahony, Shaun
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8896468/
https://www.ncbi.nlm.nih.gov/pubmed/35042722
http://dx.doi.org/10.1101/gr.275394.121
author Cochran, Kelly
Srivastava, Divyanshi
Shrikumar, Avanti
Balsubramani, Akshay
Hardison, Ross C.
Kundaje, Anshul
Mahony, Shaun
author_facet Cochran, Kelly
Srivastava, Divyanshi
Shrikumar, Avanti
Balsubramani, Akshay
Hardison, Ross C.
Kundaje, Anshul
Mahony, Shaun
author_sort Cochran, Kelly
collection PubMed
description The intrinsic DNA sequence preferences and cell type–specific cooperative partners of transcription factors (TFs) are typically highly conserved. Hence, despite the rapid evolutionary turnover of individual TF binding sites, predictive sequence models of cell type–specific genomic occupancy of a TF in one species should generalize to closely matched cell types in a related species. To assess the viability of cross-species TF binding prediction, we train neural networks to discriminate ChIP-seq peak locations from genomic background and evaluate their performance within and across species. Cross-species predictive performance is consistently worse than within-species performance, which we show is caused in part by species-specific repeats. To account for this domain shift, we use an augmented network architecture to automatically discourage learning of training species–specific sequence features. This domain adaptation approach corrects for prediction errors on species-specific repeats and improves overall cross-species model performance. Our results show that cross-species TF binding prediction is feasible when models account for domain shifts driven by species-specific repeats.
format Online
Article
Text
id pubmed-8896468
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-8896468 2022-09-01 Domain-adaptive neural networks improve cross-species prediction of transcription factor binding Cochran, Kelly Srivastava, Divyanshi Shrikumar, Avanti Balsubramani, Akshay Hardison, Ross C. Kundaje, Anshul Mahony, Shaun Genome Res Method The intrinsic DNA sequence preferences and cell type–specific cooperative partners of transcription factors (TFs) are typically highly conserved. Hence, despite the rapid evolutionary turnover of individual TF binding sites, predictive sequence models of cell type–specific genomic occupancy of a TF in one species should generalize to closely matched cell types in a related species. To assess the viability of cross-species TF binding prediction, we train neural networks to discriminate ChIP-seq peak locations from genomic background and evaluate their performance within and across species. Cross-species predictive performance is consistently worse than within-species performance, which we show is caused in part by species-specific repeats. To account for this domain shift, we use an augmented network architecture to automatically discourage learning of training species–specific sequence features. This domain adaptation approach corrects for prediction errors on species-specific repeats and improves overall cross-species model performance. Our results show that cross-species TF binding prediction is feasible when models account for domain shifts driven by species-specific repeats. Cold Spring Harbor Laboratory Press 2022-03 /pmc/articles/PMC8896468/ /pubmed/35042722 http://dx.doi.org/10.1101/gr.275394.121 Text en © 2022 Cochran et al.; Published by Cold Spring Harbor Laboratory Press https://creativecommons.org/licenses/by-nc/4.0/ This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml).
After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
spellingShingle Method
Cochran, Kelly
Srivastava, Divyanshi
Shrikumar, Avanti
Balsubramani, Akshay
Hardison, Ross C.
Kundaje, Anshul
Mahony, Shaun
Domain-adaptive neural networks improve cross-species prediction of transcription factor binding
title Domain-adaptive neural networks improve cross-species prediction of transcription factor binding
title_full Domain-adaptive neural networks improve cross-species prediction of transcription factor binding
title_fullStr Domain-adaptive neural networks improve cross-species prediction of transcription factor binding
title_full_unstemmed Domain-adaptive neural networks improve cross-species prediction of transcription factor binding
title_short Domain-adaptive neural networks improve cross-species prediction of transcription factor binding
title_sort domain-adaptive neural networks improve cross-species prediction of transcription factor binding
topic Method
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8896468/
https://www.ncbi.nlm.nih.gov/pubmed/35042722
http://dx.doi.org/10.1101/gr.275394.121