DeepSATA: A Deep Learning-Based Sequence Analyzer Incorporating the Transcription Factor Binding Affinity to Dissect the Effects of Non-Coding Genetic Variants
Utilizing large-scale epigenomics data, deep learning tools can predict the regulatory activity of genomic sequences, annotate non-coding genetic variants, and uncover mechanisms behind complex traits. However, these tools primarily rely on human or mouse data for training, limiting their performanc...
Main Authors: | Ma, Wenlong; Fu, Yang; Bao, Yongzhou; Wang, Zhen; Lei, Bowen; Zheng, Weigang; Wang, Chao; Liu, Yuwen |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
MDPI
2023
|
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10418434/ https://www.ncbi.nlm.nih.gov/pubmed/37569400 http://dx.doi.org/10.3390/ijms241512023 |
_version_ | 1785088263508197376 |
---|---|
author | Ma, Wenlong Fu, Yang Bao, Yongzhou Wang, Zhen Lei, Bowen Zheng, Weigang Wang, Chao Liu, Yuwen |
author_facet | Ma, Wenlong Fu, Yang Bao, Yongzhou Wang, Zhen Lei, Bowen Zheng, Weigang Wang, Chao Liu, Yuwen |
author_sort | Ma, Wenlong |
collection | PubMed |
description | Utilizing large-scale epigenomics data, deep learning tools can predict the regulatory activity of genomic sequences, annotate non-coding genetic variants, and uncover mechanisms behind complex traits. However, these tools primarily rely on human or mouse data for training, limiting their performance when applied to other species. Furthermore, the limited exploration of many species, particularly in the case of livestock, has led to a scarcity of comprehensive and high-quality epigenetic data, posing challenges in developing reliable deep learning models for decoding their non-coding genomes. The cross-species prediction of the regulatory genome can be achieved by leveraging publicly available data from extensively studied organisms and making use of the conserved DNA binding preferences of transcription factors within the same tissue. In this study, we introduced DeepSATA, a novel deep learning-based sequence analyzer that incorporates the transcription factor binding affinity for the cross-species prediction of chromatin accessibility. By applying DeepSATA to analyze the genomes of pigs, chickens, cattle, humans, and mice, we demonstrated its ability to improve the prediction accuracy of chromatin accessibility and achieve reliable cross-species predictions in animals. Additionally, we showcased its effectiveness in analyzing pig genetic variants associated with economic traits and in increasing the accuracy of genomic predictions. Overall, our study presents a valuable tool to explore the epigenomic landscape of various species and pinpoint regulatory deoxyribonucleic acid (DNA) variants associated with complex traits. |
format | Online Article Text |
id | pubmed-10418434 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10418434 2023-08-12 DeepSATA: A Deep Learning-Based Sequence Analyzer Incorporating the Transcription Factor Binding Affinity to Dissect the Effects of Non-Coding Genetic Variants Ma, Wenlong Fu, Yang Bao, Yongzhou Wang, Zhen Lei, Bowen Zheng, Weigang Wang, Chao Liu, Yuwen Int J Mol Sci Article 
MDPI 2023-07-27 /pmc/articles/PMC10418434/ /pubmed/37569400 http://dx.doi.org/10.3390/ijms241512023 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Ma, Wenlong Fu, Yang Bao, Yongzhou Wang, Zhen Lei, Bowen Zheng, Weigang Wang, Chao Liu, Yuwen DeepSATA: A Deep Learning-Based Sequence Analyzer Incorporating the Transcription Factor Binding Affinity to Dissect the Effects of Non-Coding Genetic Variants |
title | DeepSATA: A Deep Learning-Based Sequence Analyzer Incorporating the Transcription Factor Binding Affinity to Dissect the Effects of Non-Coding Genetic Variants |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10418434/ https://www.ncbi.nlm.nih.gov/pubmed/37569400 http://dx.doi.org/10.3390/ijms241512023 |