m(7)GDisAI: N7-methylguanosine (m(7)G) sites and diseases associations inference based on heterogeneous network

BACKGROUND: Recent studies have confirmed that N7-methylguanosine (m(7)G) modification plays an important role in regulating various biological processes and is associated with multiple diseases. Wet-lab experiments are costly and time-consuming for the identification of disease-associated m(7)G...

Bibliographic Details
Main Authors: Ma, Jiani, Zhang, Lin, Chen, Jin, Song, Bowen, Zang, Chenxuan, Liu, Hui
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects: Software
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7992861/
https://www.ncbi.nlm.nih.gov/pubmed/33761868
http://dx.doi.org/10.1186/s12859-021-04007-9
author Ma, Jiani
Zhang, Lin
Chen, Jin
Song, Bowen
Zang, Chenxuan
Liu, Hui
collection PubMed
description BACKGROUND: Recent studies have confirmed that N7-methylguanosine (m(7)G) modification plays an important role in regulating various biological processes and is associated with multiple diseases. Wet-lab identification of disease-associated m(7)G sites is costly and time-consuming. To date, tens of thousands of m(7)G sites have been identified by high-throughput sequencing, and this information is publicly available in bioinformatics databases, where it can be leveraged to predict potential disease-associated m(7)G sites computationally. Computational methods for m(7)G-disease association prediction are therefore urgently needed, but none is currently available.
RESULTS: To fill this gap, we collected association information between m(7)G sites and diseases, genomic information on m(7)G sites, and phenotypic information on diseases from different databases to build an m(7)G-disease association dataset. To infer potential disease-associated m(7)G sites, we then proposed a heterogeneous network-based model, the m(7)G Sites and Diseases Associations Inference (m(7)GDisAI) model. m(7)GDisAI predicts potential disease-associated m(7)G sites by applying a matrix decomposition method to heterogeneous networks that integrate comprehensive similarity information on m(7)G sites and diseases. To evaluate prediction performance, 10 runs of tenfold cross-validation were first conducted, in which m(7)GDisAI achieved the highest AUC of 0.740 (± 0.0024). Global and local leave-one-out cross-validation (LOOCV) experiments were then performed to evaluate the model's accuracy in global and local settings, respectively: the AUC was 0.769 in global LOOCV and 0.635 in local LOOCV. Finally, a case study was conducted to identify the most promising ovarian cancer-related m(7)G sites for further functional analysis. Gene Ontology (GO) enrichment analysis was performed to explore the associations between the host genes of m(7)G sites and GO terms. The results showed that the disease-associated m(7)G sites identified by m(7)GDisAI and their host genes are consistently related to the pathogenesis of ovarian cancer, which may provide clues to disease pathogenesis.
CONCLUSION: The m(7)GDisAI web server, available at http://180.208.58.66/m7GDisAI/, provides a user-friendly interface for querying disease-associated m(7)G sites. The list of the top 20 m(7)G sites predicted to be associated with 177 diseases can be retrieved. Detailed information about specific m(7)G sites and diseases is also shown.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04007-9.
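Note: the evaluation protocol named in the abstract (hiding known associations fold by fold, scoring them with a factorized association matrix, and summarizing with AUC) can be sketched as follows. This is a minimal illustration under stated assumptions, not the m(7)GDisAI implementation: the factorization here is a plain regularized low-rank decomposition that ignores the similarity networks the abstract describes, and the names `auc_score`, `factorize`, `tenfold_auc`, the hyperparameters, and the random toy matrix are all hypothetical.

```python
import numpy as np


def auc_score(labels, scores):
    """Rank-based AUC (Mann-Whitney U statistic), averaging ranks over ties."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores, kind="mergesort")
    ranks = np.empty(len(scores), dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)
    for value in np.unique(scores):  # tied scores share their mean rank
        tied = scores == value
        ranks[tied] = ranks[tied].mean()
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    return (ranks[labels].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def factorize(A, rank=5, steps=300, lr=0.02, reg=0.1, seed=0):
    """Approximate A by U @ V.T using gradient descent on the squared error.

    Assumption: a generic stand-in for the 'matrix decomposition method';
    no site or disease similarity information is used here.
    """
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((A.shape[0], rank))
    V = 0.1 * rng.standard_normal((A.shape[1], rank))
    for _ in range(steps):
        E = A - U @ V.T  # residual of the current reconstruction
        U, V = U + lr * (E @ V - reg * U), V + lr * (E.T @ U - reg * V)
    return U @ V.T


def tenfold_auc(A, n_folds=10, seed=0):
    """Hide one fold of known associations at a time and rank them
    against all unknown (zero) entries; return the mean AUC over folds."""
    rng = np.random.default_rng(seed)
    pos = np.argwhere(A == 1)
    unknown = A == 0
    aucs = []
    for fold in np.array_split(rng.permutation(len(pos)), n_folds):
        if len(fold) == 0:
            continue
        held = pos[fold]
        train = A.copy()
        train[held[:, 0], held[:, 1]] = 0  # mask the held-out associations
        scores = factorize(train)
        pos_scores = scores[held[:, 0], held[:, 1]]
        neg_scores = scores[unknown]
        labels = np.r_[np.ones(len(pos_scores)), np.zeros(len(neg_scores))]
        aucs.append(auc_score(labels, np.r_[pos_scores, neg_scores]))
    return float(np.mean(aucs))


# Toy site-by-disease matrix (random, purely illustrative).
toy = (np.random.default_rng(1).random((30, 8)) < 0.3).astype(float)
mean_auc = tenfold_auc(toy)
```

On random toy data the resulting AUC carries no meaning; the sketch only shows how held-out associations are masked and scored. Repeating `tenfold_auc` with different seeds would correspond to the "10 runs" mentioned in the abstract.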
format Online
Article
Text
id pubmed-7992861
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7992861 2021-03-25 m(7)GDisAI: N7-methylguanosine (m(7)G) sites and diseases associations inference based on heterogeneous network Ma, Jiani Zhang, Lin Chen, Jin Song, Bowen Zang, Chenxuan Liu, Hui BMC Bioinformatics Software BioMed Central 2021-03-24 /pmc/articles/PMC7992861/ /pubmed/33761868 http://dx.doi.org/10.1186/s12859-021-04007-9 Text en © The Author(s) 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title m(7)GDisAI: N7-methylguanosine (m(7)G) sites and diseases associations inference based on heterogeneous network
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7992861/
https://www.ncbi.nlm.nih.gov/pubmed/33761868
http://dx.doi.org/10.1186/s12859-021-04007-9