A genome-wide MeSH-based literature mining system predicts implicit gene-to-gene relationships and networks

Bibliographic Details
Main authors: Xiang, Zuoshuang; Qin, Tingting; Qin, Zhaohui S; He, Yongqun
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3852244/
https://www.ncbi.nlm.nih.gov/pubmed/24555475
http://dx.doi.org/10.1186/1752-0509-7-S3-S9
author Xiang, Zuoshuang
Qin, Tingting
Qin, Zhaohui S
He, Yongqun
collection PubMed
description BACKGROUND: The large amount of literature in the post-genomics era enables the study of gene interactions and networks using all available articles published for a specific organism. MeSH is a controlled vocabulary of medical and scientific terms that is used by biomedical scientists to manually index articles in the PubMed literature database. We hypothesized that genome-wide gene-MeSH term associations from the PubMed literature database could be used to predict implicit gene-to-gene relationships and networks. While the gene-MeSH associations have been used to detect gene-gene interactions in some studies, different methods have not been well compared, and such a strategy has not been evaluated for a genome-wide literature analysis. Genome-wide literature mining of gene-to-gene interactions allows ranking of the best gene interactions and investigation of comprehensive biological networks at a genome level.
RESULTS: The genome-wide GenoMesh literature mining algorithm was developed by sequentially generating a gene-article matrix, a normalized gene-MeSH term matrix, and a gene-gene matrix. The gene-gene matrix relies on the calculation of pairwise gene dissimilarities based on gene-MeSH relationships. An optimized dissimilarity score was identified from six well-studied functions based on a receiver operating characteristic (ROC) analysis. Based on the studies with well-studied Escherichia coli and less-studied Brucella spp., GenoMesh was found to accurately identify gene functions using weighted MeSH terms, predict gene-gene interactions not reported in the literature, and cluster all the genes studied from an organism using the MeSH-based gene-gene matrix. A web-based GenoMesh literature mining program is also available at: http://genomesh.hegroup.org. GenoMesh also predicts gene interactions and networks among genes associated with specific MeSH terms or user-selected gene lists.
CONCLUSIONS: The GenoMesh algorithm and web program provide the first genome-wide, MeSH-based literature mining system that effectively predicts implicit gene-gene interaction relationships and networks in a genome-wide scope.
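The three-matrix pipeline named in the abstract (gene-article, normalized gene-MeSH, gene-gene) can be illustrated with a small sketch. This is not the GenoMesh implementation: the abstract does not say which normalization or which of the six dissimilarity functions was selected, so the unit-length scaling and the cosine dissimilarity below are assumptions chosen for illustration. The gene names, article IDs, MeSH assignments, and function names are all hypothetical toy values.

```python
import math

# Toy gene-article matrix (hypothetical data): gene -> articles mentioning it.
gene_articles = {
    "geneA": {"pmid1", "pmid2"},
    "geneB": {"pmid2", "pmid3"},
    "geneC": {"pmid4"},
}

# Toy article-MeSH indexing (hypothetical data): article -> MeSH terms.
article_mesh = {
    "pmid1": {"Virulence"},
    "pmid2": {"Virulence", "Iron"},
    "pmid3": {"Iron"},
    "pmid4": {"Flagella"},
}


def gene_mesh_vectors(gene_articles, article_mesh):
    """Build gene-MeSH count vectors and scale each to unit length.

    Unit-length scaling is an assumed normalization, not the paper's.
    """
    vectors = {}
    for gene, articles in gene_articles.items():
        counts = {}
        for pmid in articles:
            for term in article_mesh.get(pmid, ()):
                counts[term] = counts.get(term, 0) + 1
        norm = math.sqrt(sum(c * c for c in counts.values()))
        vectors[gene] = {t: c / norm for t, c in counts.items()} if norm else {}
    return vectors


def cosine_dissimilarity(u, v):
    """Return 1 - cosine similarity for two unit-length sparse vectors."""
    shared = set(u) & set(v)
    return 1.0 - sum(u[t] * v[t] for t in shared)


def gene_gene_matrix(vectors):
    """Compute the pairwise dissimilarity for every ordered gene pair."""
    genes = sorted(vectors)
    return {
        (g, h): cosine_dissimilarity(vectors[g], vectors[h])
        for g in genes
        for h in genes
    }


vectors = gene_mesh_vectors(gene_articles, article_mesh)
gene_gene = gene_gene_matrix(vectors)
```

In this toy data geneA and geneB never need to be mentioned together for a low dissimilarity to arise; they only need to share MeSH terms, which is the sense in which such a relationship is implicit. Selecting among candidate dissimilarity functions by ROC analysis, as the abstract describes, would additionally require a reference set of known gene-gene interactions, which is not part of this sketch.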
format Online
Article
Text
id pubmed-3852244
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3852244 2013-12-20 A genome-wide MeSH-based literature mining system predicts implicit gene-to-gene relationships and networks Xiang, Zuoshuang Qin, Tingting Qin, Zhaohui S He, Yongqun BMC Syst Biol Research BioMed Central 2013-10-16 /pmc/articles/PMC3852244/ /pubmed/24555475 http://dx.doi.org/10.1186/1752-0509-7-S3-S9 Text en Copyright © 2013 Xiang et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A genome-wide MeSH-based literature mining system predicts implicit gene-to-gene relationships and networks
topic Research