
Interpretation of convolutional neural networks reveals crucial sequence features involving in transcription during fiber development

BACKGROUND: Upland cotton provides the most natural fiber in the world. During fiber development, the quality and yield of fiber were influenced by gene transcription. Revealing sequence features related to transcription has a profound impact on cotton molecular breeding. We applied convolutional ne...


Bibliographic Details
Main authors: Liu, Shang; Cheng, Hailiang; Ashraf, Javaria; Zhang, Youping; Wang, Qiaolian; Lv, Limin; He, Man; Song, Guoli; Zuo, Dongyun
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8922751/
https://www.ncbi.nlm.nih.gov/pubmed/35291940
http://dx.doi.org/10.1186/s12859-022-04619-9
_version_ 1784669556801798144
author Liu, Shang
Cheng, Hailiang
Ashraf, Javaria
Zhang, Youping
Wang, Qiaolian
Lv, Limin
He, Man
Song, Guoli
Zuo, Dongyun
author_facet Liu, Shang
Cheng, Hailiang
Ashraf, Javaria
Zhang, Youping
Wang, Qiaolian
Lv, Limin
He, Man
Song, Guoli
Zuo, Dongyun
author_sort Liu, Shang
collection PubMed
description BACKGROUND: Upland cotton provides the most natural fiber in the world. During fiber development, the quality and yield of fiber were influenced by gene transcription. Revealing sequence features related to transcription has a profound impact on cotton molecular breeding. We applied convolutional neural networks to predict gene expression status based on the sequences of gene transcription start regions. After that, a gradient-based interpretation and an N-adjusted kernel transformation were implemented to extract sequence features contributing to transcription. RESULTS: Our models had approximate 80% accuracies, and the area under the receiver operating characteristic curve reached over 0.85. Gradient-based interpretation revealed 5' untranslated region contributed to gene transcription. Furthermore, 6 DOF binding motifs and 4 transcription activator binding motifs were obtained by N-adjusted kernel-motif transformation from models in three developmental stages. Apart from 10 general motifs, 3 DOF5.1 genes were also detected. In silico analysis about these motifs’ binding proteins implied their potential functions in fiber formation. Besides, we also found some novel motifs in plants as important sequence features for transcription. CONCLUSIONS: In conclusion, the N-adjusted kernel transformation method could interpret convolutional neural networks and reveal important sequence features related to transcription during fiber development. Potential functions of motifs interpreted from convolutional neural networks could be validated by further wet-lab experiments and applied in cotton molecular breeding. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-04619-9.
format Online
Article
Text
id pubmed-8922751
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-89227512022-03-22 Interpretation of convolutional neural networks reveals crucial sequence features involving in transcription during fiber development Liu, Shang Cheng, Hailiang Ashraf, Javaria Zhang, Youping Wang, Qiaolian Lv, Limin He, Man Song, Guoli Zuo, Dongyun BMC Bioinformatics Research BACKGROUND: Upland cotton provides the most natural fiber in the world. During fiber development, the quality and yield of fiber were influenced by gene transcription. Revealing sequence features related to transcription has a profound impact on cotton molecular breeding. We applied convolutional neural networks to predict gene expression status based on the sequences of gene transcription start regions. After that, a gradient-based interpretation and an N-adjusted kernel transformation were implemented to extract sequence features contributing to transcription. RESULTS: Our models had approximate 80% accuracies, and the area under the receiver operating characteristic curve reached over 0.85. Gradient-based interpretation revealed 5' untranslated region contributed to gene transcription. Furthermore, 6 DOF binding motifs and 4 transcription activator binding motifs were obtained by N-adjusted kernel-motif transformation from models in three developmental stages. Apart from 10 general motifs, 3 DOF5.1 genes were also detected. In silico analysis about these motifs’ binding proteins implied their potential functions in fiber formation. Besides, we also found some novel motifs in plants as important sequence features for transcription. CONCLUSIONS: In conclusion, the N-adjusted kernel transformation method could interpret convolutional neural networks and reveal important sequence features related to transcription during fiber development. Potential functions of motifs interpreted from convolutional neural networks could be validated by further wet-lab experiments and applied in cotton molecular breeding. 
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-04619-9. BioMed Central 2022-03-15 /pmc/articles/PMC8922751/ /pubmed/35291940 http://dx.doi.org/10.1186/s12859-022-04619-9 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Liu, Shang
Cheng, Hailiang
Ashraf, Javaria
Zhang, Youping
Wang, Qiaolian
Lv, Limin
He, Man
Song, Guoli
Zuo, Dongyun
Interpretation of convolutional neural networks reveals crucial sequence features involving in transcription during fiber development
title Interpretation of convolutional neural networks reveals crucial sequence features involving in transcription during fiber development
title_full Interpretation of convolutional neural networks reveals crucial sequence features involving in transcription during fiber development
title_fullStr Interpretation of convolutional neural networks reveals crucial sequence features involving in transcription during fiber development
title_full_unstemmed Interpretation of convolutional neural networks reveals crucial sequence features involving in transcription during fiber development
title_short Interpretation of convolutional neural networks reveals crucial sequence features involving in transcription during fiber development
title_sort interpretation of convolutional neural networks reveals crucial sequence features involving in transcription during fiber development
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8922751/
https://www.ncbi.nlm.nih.gov/pubmed/35291940
http://dx.doi.org/10.1186/s12859-022-04619-9
work_keys_str_mv AT liushang interpretationofconvolutionalneuralnetworksrevealscrucialsequencefeaturesinvolvingintranscriptionduringfiberdevelopment
AT chenghailiang interpretationofconvolutionalneuralnetworksrevealscrucialsequencefeaturesinvolvingintranscriptionduringfiberdevelopment
AT ashrafjavaria interpretationofconvolutionalneuralnetworksrevealscrucialsequencefeaturesinvolvingintranscriptionduringfiberdevelopment
AT zhangyouping interpretationofconvolutionalneuralnetworksrevealscrucialsequencefeaturesinvolvingintranscriptionduringfiberdevelopment
AT wangqiaolian interpretationofconvolutionalneuralnetworksrevealscrucialsequencefeaturesinvolvingintranscriptionduringfiberdevelopment
AT lvlimin interpretationofconvolutionalneuralnetworksrevealscrucialsequencefeaturesinvolvingintranscriptionduringfiberdevelopment
AT heman interpretationofconvolutionalneuralnetworksrevealscrucialsequencefeaturesinvolvingintranscriptionduringfiberdevelopment
AT songguoli interpretationofconvolutionalneuralnetworksrevealscrucialsequencefeaturesinvolvingintranscriptionduringfiberdevelopment
AT zuodongyun interpretationofconvolutionalneuralnetworksrevealscrucialsequencefeaturesinvolvingintranscriptionduringfiberdevelopment