Biodiversity Observations Miner: A web application to unlock primary biodiversity data from published literature

Abstract. BACKGROUND: A considerable portion of primary biodiversity data is digitally locked inside published literature, which is often stored as PDF files. Large-scale approaches to biodiversity science could benefit from retrieving this information and making it digitally accessible and machine-r...

Bibliographic Details
Main authors: Muñoz, Gabriel, Kissling, W. Daniel, van Loon, E. Emiel
Format: Online Article Text
Language: English
Published: Pensoft Publishers 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6344444/
https://www.ncbi.nlm.nih.gov/pubmed/30692868
http://dx.doi.org/10.3897/BDJ.7.e28737
_version_ 1783389426644156416
author Muñoz, Gabriel
Kissling, W. Daniel
van Loon, E. Emiel
author_facet Muñoz, Gabriel
Kissling, W. Daniel
van Loon, E. Emiel
author_sort Muñoz, Gabriel
collection PubMed
description Abstract. BACKGROUND: A considerable portion of primary biodiversity data is digitally locked inside published literature, which is often stored as PDF files. Large-scale approaches to biodiversity science could benefit from retrieving this information and making it digitally accessible and machine-readable. Nonetheless, the amount and diversity of digitally published literature pose many challenges for knowledge discovery and retrieval. Text mining has been extensively used for data discovery tasks in large quantities of documents. However, text mining approaches for knowledge discovery and retrieval have been limited in biodiversity science compared to other disciplines. NEW INFORMATION: Here, we present a novel, open-source text mining tool, the Biodiversity Observations Miner (BOM). This web application, written in R, allows the semi-automated discovery of punctual biodiversity observations (e.g. biotic interactions, functional or behavioural traits and natural history descriptions) associated with the scientific names present inside a corpus of scientific literature. Furthermore, BOM enables users to rapidly screen large quantities of literature based on word co-occurrences that match custom biodiversity dictionaries. This tool aims to increase the digital mobilisation of primary biodiversity data and is freely accessible via GitHub or through a web server.
format Online
Article
Text
id pubmed-6344444
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Pensoft Publishers
record_format MEDLINE/PubMed
spelling pubmed-6344444 2019-01-28 Biodiversity Observations Miner: A web application to unlock primary biodiversity data from published literature Muñoz, Gabriel Kissling, W. Daniel van Loon, E. Emiel Biodivers Data J Software Description Abstract. BACKGROUND: A considerable portion of primary biodiversity data is digitally locked inside published literature, which is often stored as PDF files. Large-scale approaches to biodiversity science could benefit from retrieving this information and making it digitally accessible and machine-readable. Nonetheless, the amount and diversity of digitally published literature pose many challenges for knowledge discovery and retrieval. Text mining has been extensively used for data discovery tasks in large quantities of documents. However, text mining approaches for knowledge discovery and retrieval have been limited in biodiversity science compared to other disciplines. NEW INFORMATION: Here, we present a novel, open-source text mining tool, the Biodiversity Observations Miner (BOM). This web application, written in R, allows the semi-automated discovery of punctual biodiversity observations (e.g. biotic interactions, functional or behavioural traits and natural history descriptions) associated with the scientific names present inside a corpus of scientific literature. Furthermore, BOM enables users to rapidly screen large quantities of literature based on word co-occurrences that match custom biodiversity dictionaries. This tool aims to increase the digital mobilisation of primary biodiversity data and is freely accessible via GitHub or through a web server. Pensoft Publishers 2019-01-16 /pmc/articles/PMC6344444/ /pubmed/30692868 http://dx.doi.org/10.3897/BDJ.7.e28737 Text en Gabriel Muñoz, W. Daniel Kissling, E. Emiel van Loon http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Software Description
Muñoz, Gabriel
Kissling, W. Daniel
van Loon, E. Emiel
Biodiversity Observations Miner: A web application to unlock primary biodiversity data from published literature
title Biodiversity Observations Miner: A web application to unlock primary biodiversity data from published literature
title_full Biodiversity Observations Miner: A web application to unlock primary biodiversity data from published literature
title_fullStr Biodiversity Observations Miner: A web application to unlock primary biodiversity data from published literature
title_full_unstemmed Biodiversity Observations Miner: A web application to unlock primary biodiversity data from published literature
title_short Biodiversity Observations Miner: A web application to unlock primary biodiversity data from published literature
title_sort biodiversity observations miner: a web application to unlock primary biodiversity data from published literature
topic Software Description
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6344444/
https://www.ncbi.nlm.nih.gov/pubmed/30692868
http://dx.doi.org/10.3897/BDJ.7.e28737