Integration of open access literature into the RCSB Protein Data Bank using BioLit

BACKGROUND: Biological data have traditionally been stored and made publicly available through a variety of on-line databases, whereas biological knowledge has traditionally been found in the printed literature. With journals now on-line and providing an increasing amount of open access content, oft...

Bibliographic Details
Main authors: Prlić, Andreas; Martinez, Marco A; Dimitropoulos, Dimitris; Beran, Bojan; Yukich, Benjamin T; Rose, Peter W; Bourne, Philip E; Fink, J Lynn
Format: Text
Language: English
Published: BioMed Central 2010
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2880030/
https://www.ncbi.nlm.nih.gov/pubmed/20429930
http://dx.doi.org/10.1186/1471-2105-11-220
author Prlić, Andreas
Martinez, Marco A
Dimitropoulos, Dimitris
Beran, Bojan
Yukich, Benjamin T
Rose, Peter W
Bourne, Philip E
Fink, J Lynn
author_sort Prlić, Andreas
collection PubMed
description BACKGROUND: Biological data have traditionally been stored and made publicly available through a variety of on-line databases, whereas biological knowledge has traditionally been found in the printed literature. With journals now on-line and providing an increasing amount of open access content, often free of copyright restriction, this distinction between database and literature is blurring. To exploit this opportunity we present the integration of open access literature with the RCSB Protein Data Bank (PDB).
RESULTS: BioLit provides an enhanced view of articles with markup of semantic data and links to biological databases, based on the content of the article. For example, words matching to existing biological ontologies are highlighted and database identifiers are linked to their database of origin. Among other functions, it identifies PDB IDs that are mentioned in the open access literature, by parsing the full text for all research articles in PubMed Central (PMC) and exposing the results as simple XML Web Services. Here, we integrate BioLit results with the RCSB PDB website by using these services to find PDB IDs that are mentioned in research articles and subsequently retrieving abstract, figures, and text excerpts for those articles. A new RCSB PDB literature view permits browsing through the figures and abstracts of the articles that mention a given structure. The BioLit Web Services that are providing the underlying data are publicly accessible. A client library is provided that supports querying these services (Java).
CONCLUSIONS: The integration between literature and websites, as demonstrated here with the RCSB PDB, provides a broader view for how a given structure has been analyzed and used. This approach detects the mention of a PDB structure even if it is not formally cited in the paper. Other structures related through the same literature references can also be identified, possibly providing new scientific insight. To our knowledge this is the first time that database and literature have been integrated in this way and it speaks to the opportunities afforded by open and free access to both database and literature content.
format Text
id pubmed-2880030
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2880030 2010-06-03 Integration of open access literature into the RCSB Protein Data Bank using BioLit. Prlić, Andreas; Martinez, Marco A; Dimitropoulos, Dimitris; Beran, Bojan; Yukich, Benjamin T; Rose, Peter W; Bourne, Philip E; Fink, J Lynn. BMC Bioinformatics. Software. BioMed Central 2010-04-29 /pmc/articles/PMC2880030/ /pubmed/20429930 http://dx.doi.org/10.1186/1471-2105-11-220 Text en Copyright ©2010 Prlić et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Integration of open access literature into the RCSB Protein Data Bank using BioLit
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2880030/
https://www.ncbi.nlm.nih.gov/pubmed/20429930
http://dx.doi.org/10.1186/1471-2105-11-220