Extracting unrecognized gene relationships from the biomedical literature via matrix factorizations

BACKGROUND: The construction of literature-based networks of gene-gene interactions is one of the most important applications of text mining in bioinformatics. Extracting potential gene relationships from the biomedical literature may be helpful in building biological hypotheses that can be explored...

Bibliographic Details
Main authors: Kim, Hyunsoo; Park, Haesun; Drake, Barry L
Format: Text
Language: English
Published: BioMed Central 2007
Subjects: Proceedings
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2217664/
https://www.ncbi.nlm.nih.gov/pubmed/18047707
http://dx.doi.org/10.1186/1471-2105-8-S9-S6
_version_ 1782149295904915456
author Kim, Hyunsoo
Park, Haesun
Drake, Barry L
author_facet Kim, Hyunsoo
Park, Haesun
Drake, Barry L
author_sort Kim, Hyunsoo
collection PubMed
description BACKGROUND: The construction of literature-based networks of gene-gene interactions is one of the most important applications of text mining in bioinformatics. Extracting potential gene relationships from the biomedical literature may be helpful in building biological hypotheses that can be explored further experimentally. Recently, latent semantic indexing based on the singular value decomposition (LSI/SVD) has been applied to gene retrieval. However, the determination of the number of factors k used in the reduced rank matrix is still an open problem. RESULTS: In this paper, we introduce a way to incorporate a priori knowledge of gene relationships into LSI/SVD to determine the number of factors. We also explore the utility of the non-negative matrix factorization (NMF) to extract unrecognized gene relationships from the biomedical literature by taking advantage of known gene relationships. A gene retrieval method based on NMF (GR/NMF) showed comparable performance with LSI/SVD. CONCLUSION: Using known gene relationships of a given gene, we can determine the number of factors used in the reduced rank matrix and retrieve unrecognized genes related with the given gene by LSI/SVD or GR/NMF.
format Text
id pubmed-2217664
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-22176642008-01-31 Extracting unrecognized gene relationships from the biomedical literature via matrix factorizations Kim, Hyunsoo Park, Haesun Drake, Barry L BMC Bioinformatics Proceedings BACKGROUND: The construction of literature-based networks of gene-gene interactions is one of the most important applications of text mining in bioinformatics. Extracting potential gene relationships from the biomedical literature may be helpful in building biological hypotheses that can be explored further experimentally. Recently, latent semantic indexing based on the singular value decomposition (LSI/SVD) has been applied to gene retrieval. However, the determination of the number of factors k used in the reduced rank matrix is still an open problem. RESULTS: In this paper, we introduce a way to incorporate a priori knowledge of gene relationships into LSI/SVD to determine the number of factors. We also explore the utility of the non-negative matrix factorization (NMF) to extract unrecognized gene relationships from the biomedical literature by taking advantage of known gene relationships. A gene retrieval method based on NMF (GR/NMF) showed comparable performance with LSI/SVD. CONCLUSION: Using known gene relationships of a given gene, we can determine the number of factors used in the reduced rank matrix and retrieve unrecognized genes related with the given gene by LSI/SVD or GR/NMF. BioMed Central 2007-11-27 /pmc/articles/PMC2217664/ /pubmed/18047707 http://dx.doi.org/10.1186/1471-2105-8-S9-S6 Text en Copyright © 2007 Kim et al; licensee BioMed Central Ltd.
spellingShingle Proceedings
Kim, Hyunsoo
Park, Haesun
Drake, Barry L
Extracting unrecognized gene relationships from the biomedical literature via matrix factorizations
title Extracting unrecognized gene relationships from the biomedical literature via matrix factorizations
title_sort extracting unrecognized gene relationships from the biomedical literature via matrix factorizations
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2217664/
https://www.ncbi.nlm.nih.gov/pubmed/18047707
http://dx.doi.org/10.1186/1471-2105-8-S9-S6
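
The abstract in this record describes using known gene relationships to pick the number of factors k for LSI/SVD gene retrieval. The following is a minimal sketch of that general idea, not the authors' implementation. The toy gene-by-term matrix, the helper names (`rank_genes`, `average_precision`, `choose_k`), and the use of average precision as the selection score are all assumptions made for illustration.

```python
import numpy as np


def rank_genes(A, query, k):
    """Rank genes by cosine similarity to `query` in a rank-k LSI space.

    A is assumed to be a genes x terms matrix (one row per gene document).
    """
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    G = U[:, :k] * s[:k]                  # gene coordinates in k dimensions
    norms = np.linalg.norm(G, axis=1)
    norms[norms == 0] = 1.0               # avoid division by zero
    G = G / norms[:, None]
    sims = G @ G[query]                   # cosine similarity to the query gene
    order = np.argsort(-sims, kind="stable")
    return [int(g) for g in order if g != query]


def average_precision(ranking, known):
    """Average precision of the known related genes within a ranking."""
    hits, total = 0, 0.0
    for pos, gene in enumerate(ranking, start=1):
        if gene in known:
            hits += 1
            total += hits / pos
    return total / len(known) if known else 0.0


def choose_k(A, query, known, k_values):
    """Pick the k whose ranking best recovers the known relationships."""
    scores = {k: average_precision(rank_genes(A, query, k), known)
              for k in k_values}
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Hypothetical toy data: 6 genes x 5 terms (not taken from the paper).
A = np.array([
    [2, 1, 0, 0, 0],
    [1, 2, 0, 0, 1],
    [2, 2, 1, 0, 0],
    [0, 0, 2, 1, 0],
    [0, 0, 1, 2, 1],
    [0, 1, 0, 2, 2],
], dtype=float)

query = 0            # index of the given gene
known = {1, 2}       # genes assumed to be known relatives of the query
k_values = [1, 2, 3, 4, 5]

best_k, scores = choose_k(A, query, known, k_values)
ranking = rank_genes(A, query, best_k)
print("chosen k:", best_k)
print("ranking at chosen k:", ranking)
```

Genes that rank highly at the chosen k but are not in `known` would be the candidates for unrecognized relationships. An NMF-based variant would swap the SVD step for a non-negative factorization while keeping the same selection loop.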