DeepVISP: Deep Learning for Virus Site Integration Prediction and Motif Discovery
Approximately 15% of human cancers are estimated to be attributed to viruses. Virus sequences can be integrated into the host genome, leading to genomic instability and carcinogenesis. Here, a new deep convolutional neural network (CNN) model is developed with attention architecture, namely DeepVISP...
Main Authors: | Xu, Haodong; Jia, Peilin; Zhao, Zhongming |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | John Wiley and Sons Inc. 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8097320/ https://www.ncbi.nlm.nih.gov/pubmed/33977077 http://dx.doi.org/10.1002/advs.202004958 |
_version_ | 1783688331337400320 |
---|---|
author | Xu, Haodong Jia, Peilin Zhao, Zhongming |
author_facet | Xu, Haodong Jia, Peilin Zhao, Zhongming |
author_sort | Xu, Haodong |
collection | PubMed |
description | Approximately 15% of human cancers are estimated to be attributable to viruses. Virus sequences can be integrated into the host genome, leading to genomic instability and carcinogenesis. Here, a new deep convolutional neural network (CNN) model with an attention architecture, DeepVISP, is developed for accurately predicting oncogenic virus integration sites (VISs) in the human genome. Using curated benchmark integration data for three viruses, hepatitis B virus (HBV), human papillomavirus (HPV), and Epstein‐Barr virus (EBV), DeepVISP achieves high accuracy and robust performance for all three viruses by automatically learning informative features and essential genomic positions from the DNA sequences alone. DeepVISP improves the area under the curve (AUC) by 8.43–34.33% over conventional machine learning methods across the three viruses. Moreover, DeepVISP can decode cis‐regulatory factors that are potentially involved in virus integration and tumorigenesis, such as HOXB7, IKZF1, and LHX6. These findings are supported by multiple lines of evidence in the literature. Clustering analysis of the informative motifs reveals that the representative k‐mers in the clusters could help guide virus recognition of the host genes. A user‐friendly web server is developed for predicting putative oncogenic VISs in the human genome using DeepVISP. |
format | Online Article Text |
id | pubmed-8097320 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | John Wiley and Sons Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8097320 2021-05-10 DeepVISP: Deep Learning for Virus Site Integration Prediction and Motif Discovery Xu, Haodong Jia, Peilin Zhao, Zhongming Adv Sci (Weinh) Research Articles Approximately 15% of human cancers are estimated to be attributable to viruses. Virus sequences can be integrated into the host genome, leading to genomic instability and carcinogenesis. Here, a new deep convolutional neural network (CNN) model with an attention architecture, DeepVISP, is developed for accurately predicting oncogenic virus integration sites (VISs) in the human genome. Using curated benchmark integration data for three viruses, hepatitis B virus (HBV), human papillomavirus (HPV), and Epstein‐Barr virus (EBV), DeepVISP achieves high accuracy and robust performance for all three viruses by automatically learning informative features and essential genomic positions from the DNA sequences alone. DeepVISP improves the area under the curve (AUC) by 8.43–34.33% over conventional machine learning methods across the three viruses. Moreover, DeepVISP can decode cis‐regulatory factors that are potentially involved in virus integration and tumorigenesis, such as HOXB7, IKZF1, and LHX6. These findings are supported by multiple lines of evidence in the literature. Clustering analysis of the informative motifs reveals that the representative k‐mers in the clusters could help guide virus recognition of the host genes. A user‐friendly web server is developed for predicting putative oncogenic VISs in the human genome using DeepVISP. John Wiley and Sons Inc. 2021-03-08 /pmc/articles/PMC8097320/ /pubmed/33977077 http://dx.doi.org/10.1002/advs.202004958 Text en © 2021 The Authors.
Advanced Science published by Wiley‐VCH GmbH. This is an open access article under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Articles Xu, Haodong Jia, Peilin Zhao, Zhongming DeepVISP: Deep Learning for Virus Site Integration Prediction and Motif Discovery |
title | DeepVISP: Deep Learning for Virus Site Integration Prediction and Motif Discovery |
title_full | DeepVISP: Deep Learning for Virus Site Integration Prediction and Motif Discovery |
title_fullStr | DeepVISP: Deep Learning for Virus Site Integration Prediction and Motif Discovery |
title_full_unstemmed | DeepVISP: Deep Learning for Virus Site Integration Prediction and Motif Discovery |
title_short | DeepVISP: Deep Learning for Virus Site Integration Prediction and Motif Discovery |
title_sort | deepvisp: deep learning for virus site integration prediction and motif discovery |
topic | Research Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8097320/ https://www.ncbi.nlm.nih.gov/pubmed/33977077 http://dx.doi.org/10.1002/advs.202004958 |