Interpretation of convolutional neural networks reveals crucial sequence features involving in transcription during fiber development
BACKGROUND: Upland cotton provides the most natural fiber in the world. During fiber development, the quality and yield of fiber were influenced by gene transcription. Revealing sequence features related to transcription has a profound impact on cotton molecular breeding. We applied convolutional ne...
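Main authors: | Liu, Shang; Cheng, Hailiang; Ashraf, Javaria; Zhang, Youping; Wang, Qiaolian; Lv, Limin; He, Man; Song, Guoli; Zuo, Dongyun |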
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8922751/ https://www.ncbi.nlm.nih.gov/pubmed/35291940 http://dx.doi.org/10.1186/s12859-022-04619-9 |
_version_ | 1784669556801798144 |
---|---|
author | Liu, Shang Cheng, Hailiang Ashraf, Javaria Zhang, Youping Wang, Qiaolian Lv, Limin He, Man Song, Guoli Zuo, Dongyun |
author_facet | Liu, Shang Cheng, Hailiang Ashraf, Javaria Zhang, Youping Wang, Qiaolian Lv, Limin He, Man Song, Guoli Zuo, Dongyun |
author_sort | Liu, Shang |
collection | PubMed |
description | BACKGROUND: Upland cotton provides the most natural fiber in the world. During fiber development, the quality and yield of fiber were influenced by gene transcription. Revealing sequence features related to transcription has a profound impact on cotton molecular breeding. We applied convolutional neural networks to predict gene expression status based on the sequences of gene transcription start regions. After that, a gradient-based interpretation and an N-adjusted kernel transformation were implemented to extract sequence features contributing to transcription. RESULTS: Our models had approximate 80% accuracies, and the area under the receiver operating characteristic curve reached over 0.85. Gradient-based interpretation revealed 5' untranslated region contributed to gene transcription. Furthermore, 6 DOF binding motifs and 4 transcription activator binding motifs were obtained by N-adjusted kernel-motif transformation from models in three developmental stages. Apart from 10 general motifs, 3 DOF5.1 genes were also detected. In silico analysis about these motifs’ binding proteins implied their potential functions in fiber formation. Besides, we also found some novel motifs in plants as important sequence features for transcription. CONCLUSIONS: In conclusion, the N-adjusted kernel transformation method could interpret convolutional neural networks and reveal important sequence features related to transcription during fiber development. Potential functions of motifs interpreted from convolutional neural networks could be validated by further wet-lab experiments and applied in cotton molecular breeding. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-04619-9. |
format | Online Article Text |
id | pubmed-8922751 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-89227512022-03-22 Interpretation of convolutional neural networks reveals crucial sequence features involving in transcription during fiber development Liu, Shang Cheng, Hailiang Ashraf, Javaria Zhang, Youping Wang, Qiaolian Lv, Limin He, Man Song, Guoli Zuo, Dongyun BMC Bioinformatics Research BACKGROUND: Upland cotton provides the most natural fiber in the world. During fiber development, the quality and yield of fiber were influenced by gene transcription. Revealing sequence features related to transcription has a profound impact on cotton molecular breeding. We applied convolutional neural networks to predict gene expression status based on the sequences of gene transcription start regions. After that, a gradient-based interpretation and an N-adjusted kernel transformation were implemented to extract sequence features contributing to transcription. RESULTS: Our models had approximate 80% accuracies, and the area under the receiver operating characteristic curve reached over 0.85. Gradient-based interpretation revealed 5' untranslated region contributed to gene transcription. Furthermore, 6 DOF binding motifs and 4 transcription activator binding motifs were obtained by N-adjusted kernel-motif transformation from models in three developmental stages. Apart from 10 general motifs, 3 DOF5.1 genes were also detected. In silico analysis about these motifs’ binding proteins implied their potential functions in fiber formation. Besides, we also found some novel motifs in plants as important sequence features for transcription. CONCLUSIONS: In conclusion, the N-adjusted kernel transformation method could interpret convolutional neural networks and reveal important sequence features related to transcription during fiber development. Potential functions of motifs interpreted from convolutional neural networks could be validated by further wet-lab experiments and applied in cotton molecular breeding. 
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-04619-9. BioMed Central 2022-03-15 /pmc/articles/PMC8922751/ /pubmed/35291940 http://dx.doi.org/10.1186/s12859-022-04619-9 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Liu, Shang Cheng, Hailiang Ashraf, Javaria Zhang, Youping Wang, Qiaolian Lv, Limin He, Man Song, Guoli Zuo, Dongyun Interpretation of convolutional neural networks reveals crucial sequence features involving in transcription during fiber development |
title | Interpretation of convolutional neural networks reveals crucial sequence features involving in transcription during fiber development |
title_full | Interpretation of convolutional neural networks reveals crucial sequence features involving in transcription during fiber development |
title_fullStr | Interpretation of convolutional neural networks reveals crucial sequence features involving in transcription during fiber development |
title_full_unstemmed | Interpretation of convolutional neural networks reveals crucial sequence features involving in transcription during fiber development |
title_short | Interpretation of convolutional neural networks reveals crucial sequence features involving in transcription during fiber development |
title_sort | interpretation of convolutional neural networks reveals crucial sequence features involving in transcription during fiber development |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8922751/ https://www.ncbi.nlm.nih.gov/pubmed/35291940 http://dx.doi.org/10.1186/s12859-022-04619-9 |
work_keys_str_mv | AT liushang interpretationofconvolutionalneuralnetworksrevealscrucialsequencefeaturesinvolvingintranscriptionduringfiberdevelopment AT chenghailiang interpretationofconvolutionalneuralnetworksrevealscrucialsequencefeaturesinvolvingintranscriptionduringfiberdevelopment AT ashrafjavaria interpretationofconvolutionalneuralnetworksrevealscrucialsequencefeaturesinvolvingintranscriptionduringfiberdevelopment AT zhangyouping interpretationofconvolutionalneuralnetworksrevealscrucialsequencefeaturesinvolvingintranscriptionduringfiberdevelopment AT wangqiaolian interpretationofconvolutionalneuralnetworksrevealscrucialsequencefeaturesinvolvingintranscriptionduringfiberdevelopment AT lvlimin interpretationofconvolutionalneuralnetworksrevealscrucialsequencefeaturesinvolvingintranscriptionduringfiberdevelopment AT heman interpretationofconvolutionalneuralnetworksrevealscrucialsequencefeaturesinvolvingintranscriptionduringfiberdevelopment AT songguoli interpretationofconvolutionalneuralnetworksrevealscrucialsequencefeaturesinvolvingintranscriptionduringfiberdevelopment AT zuodongyun interpretationofconvolutionalneuralnetworksrevealscrucialsequencefeaturesinvolvingintranscriptionduringfiberdevelopment |