
Computational Approaches for Predicting Biomedical Research Collaborations

Biomedical research is increasingly collaborative, and successful collaborations often produce high-impact work. Computational approaches can be developed to predict biomedical research collaborations automatically. Previous work on collaboration prediction mainly explored the topological struc...


Bibliographic Details
Main Authors: Zhang, Qing; Yu, Hong
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4222920/
https://www.ncbi.nlm.nih.gov/pubmed/25375164
http://dx.doi.org/10.1371/journal.pone.0111795
_version_ 1782343135586680832
author Zhang, Qing
Yu, Hong
author_facet Zhang, Qing
Yu, Hong
author_sort Zhang, Qing
collection PubMed
description Biomedical research is increasingly collaborative, and successful collaborations often produce high-impact work. Computational approaches can be developed to predict biomedical research collaborations automatically. Previous work on collaboration prediction mainly explored the topological structure of research collaboration networks, leaving out the rich semantic information in the publications themselves. In this paper, we propose supervised machine learning approaches to predict research collaborations in the biomedical field. We explored both semantic features extracted from author research interest profiles and topological features of the author network. We found that the most informative semantic features for author collaborations are related to research interest, including the similarity of out-citing citations and the similarity of abstracts. Of the four supervised machine learning models (naïve Bayes, naïve Bayes multinomial, SVMs, and logistic regression), the best-performing model is logistic regression, with an ROC area ranging from 0.766 to 0.980 on different datasets. To our knowledge, we are the first to study in depth how research interest and productivity can be used for collaboration prediction. Our approach is computationally efficient, scalable, and simple to implement. The datasets of this study are available at https://github.com/qingzhanggithub/medline-collaboration-datasets.
format Online
Article
Text
id pubmed-4222920
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4222920 2014-11-13 Computational Approaches for Predicting Biomedical Research Collaborations Zhang, Qing Yu, Hong PLoS One Research Article
Public Library of Science 2014-11-06 /pmc/articles/PMC4222920/ /pubmed/25375164 http://dx.doi.org/10.1371/journal.pone.0111795 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open-access article distributed under the terms of the Creative Commons Public Domain declaration, which stipulates that, once placed in the public domain, this work may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose.
title Computational Approaches for Predicting Biomedical Research Collaborations
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4222920/
https://www.ncbi.nlm.nih.gov/pubmed/25375164
http://dx.doi.org/10.1371/journal.pone.0111795
work_keys_str_mv AT zhangqing computationalapproachesforpredictingbiomedicalresearchcollaborations
AT yuhong computationalapproachesforpredictingbiomedicalresearchcollaborations
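
The approach summarized in the description field (similarity features computed for a pair of authors, fed to a logistic regression model, and evaluated by ROC area) can be illustrated with a minimal sketch. This is not the authors' implementation. The feature definitions (bag-of-words cosine similarity for abstracts, Jaccard overlap for cited papers), the function names, the hyperparameters, and the toy author pairs are all assumptions made for illustration, and the model is scored on its own training data only to keep the example short.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between bag-of-words vectors of two texts."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def jaccard(set_a, set_b):
    """Overlap between two sets of cited-paper identifiers."""
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def predict(w, b, x):
    """Logistic model: estimated probability that the pair collaborates."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Plain batch gradient descent on the logistic loss; returns (weights, bias)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict(w, b, xi) - yi
            grad_w = [g + err * x for g, x in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def roc_auc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical author pairs: (abstract A, abstract B, cited IDs A, cited IDs B, collaborated?)
pairs = [
    ("protein folding simulation", "simulation of protein folding dynamics", {1, 2, 3}, {2, 3, 4}, 1),
    ("gene expression in tumors", "tumor gene expression profiling", {5, 6}, {5, 6, 7}, 1),
    ("protein folding simulation", "hospital billing records", {1, 2, 3}, {8, 9}, 0),
    ("gene expression in tumors", "bridge load analysis", {5, 6}, {10}, 0),
]

# Two features per pair: abstract similarity and citation overlap.
X = [[cosine_similarity(a, b), jaccard(ca, cb)] for a, b, ca, cb, _ in pairs]
y = [label for *_, label in pairs]

w, b = train_logistic(X, y)
scores = [predict(w, b, x) for x in X]
auc = roc_auc(scores, y)
print(f"weights={w}, bias={b:.3f}, training ROC area={auc:.3f}")
```

A realistic evaluation would score held-out author pairs rather than the training pairs, and could add the network topological features mentioned in the description as further columns of `X`.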