Analyzing a co-occurrence gene-interaction network to identify disease-gene association

BACKGROUND: Understanding the genetic networks and their role in chronic diseases (e.g., cancer) is one of the important objectives of biological researchers. In this work, we present a text mining system that constructs a gene-gene-interaction network for the entire human genome and then performs n...

Bibliographic Details
Main authors: Al-Aamri, Amira; Taha, Kamal; Al-Hammadi, Yousof; Maalouf, Maher; Homouz, Dirar
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects: Methodology Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6368766/
https://www.ncbi.nlm.nih.gov/pubmed/30736752
http://dx.doi.org/10.1186/s12859-019-2634-7
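
The abstract in this record describes building a gene network from literature co-occurrence and ranking genes by centrality. The following is only a minimal sketch of that general idea, not the authors' implementation: a plain co-occurrence count threshold (`min_count`) stands in for the rare-event classification models the abstract mentions, only degree and closeness centrality are shown, and the gene labels `G1` to `G5` and the toy `docs` list are made-up placeholders rather than real data.

```python
from collections import Counter, deque
from itertools import combinations


def build_cooccurrence_graph(docs, min_count=1):
    """Link two genes if they are mentioned together in at least `min_count` documents.

    `docs` is an iterable of sets of gene labels (one set per text unit).
    The threshold is an assumption standing in for a trained classifier.
    """
    pair_counts = Counter()
    for genes in docs:
        for a, b in combinations(sorted(set(genes)), 2):
            pair_counts[(a, b)] += 1
    graph = {}
    for (a, b), count in pair_counts.items():
        if count >= min_count:
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
    return graph


def degree_centrality(graph):
    """Fraction of the other nodes that each gene is directly linked to."""
    n = len(graph)
    if n < 2:
        return {gene: 0.0 for gene in graph}
    return {gene: len(neighbours) / (n - 1) for gene, neighbours in graph.items()}


def closeness_centrality(graph):
    """Inverse mean shortest-path distance, scaled by the fraction of nodes reached."""
    n = len(graph)
    scores = {}
    for source in graph:
        # Breadth-first search gives shortest path lengths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        reachable = len(dist) - 1
        total = sum(dist.values())
        scores[source] = (reachable / total) * (reachable / (n - 1)) if total else 0.0
    return scores


def rank(scores):
    """Gene labels ordered from most to least central (ties broken by label)."""
    return sorted(scores, key=lambda gene: (-scores[gene], gene))


# Hypothetical toy corpus: each set lists the genes mentioned in one document.
docs = [
    {"G1", "G2"},
    {"G1", "G2", "G3"},
    {"G1", "G4"},
    {"G1", "G4"},
    {"G3", "G4"},
    {"G1", "G5"},
]

graph = build_cooccurrence_graph(docs)
degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
print(rank(degree))
print(rank(closeness))
```

Raising `min_count` prunes weakly supported edges; a fuller treatment would also compute betweenness and eigenvector centrality, for which an established graph library is likely a better choice than hand-written code.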
author Al-Aamri, Amira
Taha, Kamal
Al-Hammadi, Yousof
Maalouf, Maher
Homouz, Dirar
author_sort Al-Aamri, Amira
collection PubMed
description BACKGROUND: Understanding the genetic networks and their role in chronic diseases (e.g., cancer) is one of the important objectives of biological researchers. In this work, we present a text mining system that constructs a gene-gene-interaction network for the entire human genome and then performs network analysis to identify disease-related genes. We recognize the interacting genes based on their co-occurrence frequency within the biomedical literature and by employing linear and non-linear rare-event classification models. We analyze the constructed network of genes by using different network centrality measures to decide on the importance of each gene. Specifically, we apply betweenness, closeness, eigenvector, and degree centrality metrics to rank the central genes of the network and to identify possible cancer-related genes. RESULTS: We evaluated the top 15 ranked genes for different cancer types (i.e., prostate, breast, and lung cancer). The average precisions for identifying breast, prostate, and lung cancer genes vary between 80% and 100%. In a prostate case study, the system predicted an average of 80% prostate-related genes. CONCLUSIONS: The results show that our system has the potential for improving the prediction accuracy of identifying gene-gene interaction and disease-gene associations. We also conduct a prostate cancer case study by using the threshold property in logistic regression, and we compare our approach with some of the state-of-the-art methods. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2634-7) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6368766
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6368766 2019-02-15 Analyzing a co-occurrence gene-interaction network to identify disease-gene association Al-Aamri, Amira; Taha, Kamal; Al-Hammadi, Yousof; Maalouf, Maher; Homouz, Dirar BMC Bioinformatics Methodology Article
BioMed Central 2019-02-08 /pmc/articles/PMC6368766/ /pubmed/30736752 http://dx.doi.org/10.1186/s12859-019-2634-7 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Analyzing a co-occurrence gene-interaction network to identify disease-gene association
topic Methodology Article