Hypotheses generation as supervised link discovery with automated class labeling on large-scale biomedical concept networks

Bibliographic Details
Main Authors:	Katukuri, Jayasimha Reddy; Xie, Ying; Raghavan, Vijay V; Gupta, Ashish
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2012
Subjects:	Proceedings
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3394427/ https://www.ncbi.nlm.nih.gov/pubmed/22759614 http://dx.doi.org/10.1186/1471-2164-13-S3-S5

author	Katukuri, Jayasimha Reddy; Xie, Ying; Raghavan, Vijay V; Gupta, Ashish
collection	PubMed
description	Computational approaches to generating hypotheses from biomedical literature have been studied intensively in recent years. Nevertheless, it remains a challenge to automatically discover novel, cross-silo biomedical hypotheses from large-scale literature repositories. To address this challenge, we first model a biomedical literature repository as a comprehensive network of biomedical concepts and formulate hypotheses generation as a process of link discovery on the concept network. We extract the relevant information from the biomedical literature corpus and generate a concept network and a concept-author map on a cluster using the Map-Reduce framework. We extract a set of heterogeneous features, such as random-walk-based features, neighborhood features, and common-author features. The number of potential links to consider for link discovery in our concept network is large; to address this scalability problem, the features are extracted from the concept network on a cluster using the Map-Reduce framework. We further model link discovery as a classification problem carried out on a training data set automatically extracted from two network snapshots taken over two consecutive time periods. A set of heterogeneous features, covering both topological and semantic features derived from the concept network, has been studied with respect to its impact on the accuracy of the proposed supervised link discovery process. A case study of hypotheses generation based on the proposed method is presented in the paper.
format	Online Article Text
id	pubmed-3394427
institution	National Center for Biotechnology Information
language	English
publishDate	2012
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-3394427 2012-07-16 BMC Genomics Proceedings BioMed Central 2012-06-11 /pmc/articles/PMC3394427/ /pubmed/22759614 http://dx.doi.org/10.1186/1471-2164-13-S3-S5 Text en Copyright ©2012 Katukuri et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Hypotheses generation as supervised link discovery with automated class labeling on large-scale biomedical concept networks
topic	Proceedings
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3394427/ https://www.ncbi.nlm.nih.gov/pubmed/22759614 http://dx.doi.org/10.1186/1471-2164-13-S3-S5
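
The description field above frames hypothesis generation as supervised link discovery, with class labels derived automatically from two consecutive snapshots of the concept network. The following is a minimal Python sketch of that labeling idea only, under stated assumptions: the function names, the toy concepts A to D, the author identifiers, and the choice of common-neighbor count and Jaccard coefficient as "neighborhood features" are all illustrative and are not taken from the paper. The random-walk features and the Map-Reduce distribution mentioned in the description are omitted.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {concept: set of neighbors}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def build_training_set(edges_t1, edges_t2, concept_authors):
    """Label concept pairs that are unlinked in the first snapshot.

    A pair is labeled 1 if it is linked in the second snapshot, else 0.
    Features are computed from the first snapshot only (hypothetical choices).
    """
    adj = build_adjacency(edges_t1)
    later_links = {frozenset(edge) for edge in edges_t2}
    rows = []
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # already linked in the first snapshot, not a candidate
        common = adj[u] & adj[v]
        union = adj[u] | adj[v]
        authors_u = concept_authors.get(u, set())
        authors_v = concept_authors.get(v, set())
        features = {
            "common_neighbors": len(common),
            "jaccard": len(common) / len(union) if union else 0.0,
            "common_authors": len(authors_u & authors_v),
        }
        label = 1 if frozenset((u, v)) in later_links else 0
        rows.append(((u, v), features, label))
    return rows


# Toy data: the link A-C appears only in the second snapshot.
edges_t1 = [("A", "B"), ("B", "C"), ("C", "D")]
edges_t2 = edges_t1 + [("A", "C")]
concept_authors = {"A": {"x", "y"}, "C": {"y"}, "D": {"z"}}

for pair, features, label in build_training_set(edges_t1, edges_t2, concept_authors):
    print(pair, features, label)
```

The resulting rows (feature dictionary plus 0/1 label) could then be passed to any standard classifier; which classifier the authors used is not stated in this record.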
