An open source infrastructure for managing knowledge and finding potential collaborators in a domain-specific subset of PubMed, with an example from human genome epidemiology
BACKGROUND: Identifying relevant research in an ever-growing body of published literature is becoming increasingly difficult. Establishing domain-specific knowledge bases may be a more effective and efficient way to manage and query information within specific biomedical fields. Adopting controlled...
Main authors: | Yu, Wei; Yesupriya, Ajay; Wulf, Anja; Qu, Junfeng; Khoury, Muin J; Gwinn, Marta |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2007 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2248211/ https://www.ncbi.nlm.nih.gov/pubmed/17996092 http://dx.doi.org/10.1186/1471-2105-8-436 |
_version_ | 1782150981374443520 |
---|---|
author | Yu, Wei Yesupriya, Ajay Wulf, Anja Qu, Junfeng Khoury, Muin J Gwinn, Marta |
author_facet | Yu, Wei Yesupriya, Ajay Wulf, Anja Qu, Junfeng Khoury, Muin J Gwinn, Marta |
author_sort | Yu, Wei |
collection | PubMed |
description | BACKGROUND: Identifying relevant research in an ever-growing body of published literature is becoming increasingly difficult. Establishing domain-specific knowledge bases may be a more effective and efficient way to manage and query information within specific biomedical fields. Adopting controlled vocabulary is a critical step toward data integration and interoperability in any information system. We present an open source infrastructure that provides a powerful capacity for managing and mining data within a domain-specific knowledge base. As a practical application of our infrastructure, we presented two applications – Literature Finder and Investigator Browser – as well as a tool set for automating the data curating process for the human genome published literature database. The design of this infrastructure makes the system potentially extensible to other data sources. RESULTS: Information retrieval and usability tests demonstrated that the system had high rates of recall and precision, 90% and 93% respectively. The system was easy to learn, easy to use, reasonably speedy and effective. CONCLUSION: The open source system infrastructure presented in this paper provides a novel approach to managing and querying information and knowledge from domain-specific PubMed data. Using the controlled vocabulary UMLS enhanced data integration and interoperability and the extensibility of the system. In addition, by using MVC-based design and Java as a platform-independent programming language, this system provides a potential infrastructure for any domain-specific knowledge base in the biomedical field. |
format | Text |
id | pubmed-2248211 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2007 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-22482112008-02-20 An open source infrastructure for managing knowledge and finding potential collaborators in a domain-specific subset of PubMed, with an example from human genome epidemiology Yu, Wei Yesupriya, Ajay Wulf, Anja Qu, Junfeng Khoury, Muin J Gwinn, Marta BMC Bioinformatics Methodology Article BACKGROUND: Identifying relevant research in an ever-growing body of published literature is becoming increasingly difficult. Establishing domain-specific knowledge bases may be a more effective and efficient way to manage and query information within specific biomedical fields. Adopting controlled vocabulary is a critical step toward data integration and interoperability in any information system. We present an open source infrastructure that provides a powerful capacity for managing and mining data within a domain-specific knowledge base. As a practical application of our infrastructure, we presented two applications – Literature Finder and Investigator Browser – as well as a tool set for automating the data curating process for the human genome published literature database. The design of this infrastructure makes the system potentially extensible to other data sources. RESULTS: Information retrieval and usability tests demonstrated that the system had high rates of recall and precision, 90% and 93% respectively. The system was easy to learn, easy to use, reasonably speedy and effective. CONCLUSION: The open source system infrastructure presented in this paper provides a novel approach to managing and querying information and knowledge from domain-specific PubMed data. Using the controlled vocabulary UMLS enhanced data integration and interoperability and the extensibility of the system. In addition, by using MVC-based design and Java as a platform-independent programming language, this system provides a potential infrastructure for any domain-specific knowledge base in the biomedical field. 
BioMed Central 2007-11-09 /pmc/articles/PMC2248211/ /pubmed/17996092 http://dx.doi.org/10.1186/1471-2105-8-436 Text en Copyright © 2007 Yu et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methodology Article Yu, Wei Yesupriya, Ajay Wulf, Anja Qu, Junfeng Khoury, Muin J Gwinn, Marta An open source infrastructure for managing knowledge and finding potential collaborators in a domain-specific subset of PubMed, with an example from human genome epidemiology |
title | An open source infrastructure for managing knowledge and finding potential collaborators in a domain-specific subset of PubMed, with an example from human genome epidemiology |
title_full | An open source infrastructure for managing knowledge and finding potential collaborators in a domain-specific subset of PubMed, with an example from human genome epidemiology |
title_fullStr | An open source infrastructure for managing knowledge and finding potential collaborators in a domain-specific subset of PubMed, with an example from human genome epidemiology |
title_full_unstemmed | An open source infrastructure for managing knowledge and finding potential collaborators in a domain-specific subset of PubMed, with an example from human genome epidemiology |
title_short | An open source infrastructure for managing knowledge and finding potential collaborators in a domain-specific subset of PubMed, with an example from human genome epidemiology |
title_sort | open source infrastructure for managing knowledge and finding potential collaborators in a domain-specific subset of pubmed, with an example from human genome epidemiology |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2248211/ https://www.ncbi.nlm.nih.gov/pubmed/17996092 http://dx.doi.org/10.1186/1471-2105-8-436 |