ENVIRONMENTS and EOL: identification of Environment Ontology terms in text and the annotation of the Encyclopedia of Life

Summary: The association of organisms to their environments is a key issue in exploring biodiversity patterns. This knowledge has traditionally been scattered, but textual descriptions of taxa and their habitats are now being consolidated in centralized resources. However, structured annotations are...

Bibliographic Details
Main authors: Pafilis, Evangelos, Frankild, Sune P., Schnetzer, Julia, Fanini, Lucia, Faulwetter, Sarah, Pavloudi, Christina, Vasileiadou, Katerina, Leary, Patrick, Hammock, Jennifer, Schulz, Katja, Parr, Cynthia Sims, Arvanitidis, Christos, Jensen, Lars Juhl
Format: Online Article Text
Language: English
Published: Oxford University Press 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4443677/
https://www.ncbi.nlm.nih.gov/pubmed/25619994
http://dx.doi.org/10.1093/bioinformatics/btv045
author Pafilis, Evangelos
Frankild, Sune P.
Schnetzer, Julia
Fanini, Lucia
Faulwetter, Sarah
Pavloudi, Christina
Vasileiadou, Katerina
Leary, Patrick
Hammock, Jennifer
Schulz, Katja
Parr, Cynthia Sims
Arvanitidis, Christos
Jensen, Lars Juhl
collection PubMed
description Summary: The association of organisms to their environments is a key issue in exploring biodiversity patterns. This knowledge has traditionally been scattered, but textual descriptions of taxa and their habitats are now being consolidated in centralized resources. However, structured annotations are needed to facilitate large-scale analyses. Therefore, we developed ENVIRONMENTS, a fast dictionary-based tagger capable of identifying Environment Ontology (ENVO) terms in text. We evaluate the accuracy of the tagger on a new manually curated corpus of 600 Encyclopedia of Life (EOL) species pages. We use the tagger to associate taxa with environments by tagging EOL text content monthly, and integrate the results into the EOL to disseminate them to a broad audience of users. Availability and implementation: The software and the corpus are available under the open-source BSD and the CC-BY-NC-SA 3.0 licenses, respectively, at http://environments.hcmr.gr Contact: pafilis@hcmr.gr or lars.juhl.jensen@cpr.ku.dk Supplementary information: Supplementary data are available at Bioinformatics online.
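The abstract does not describe the internals of ENVIRONMENTS, so the following is only a generic sketch of what dictionary-based tagging means: names from a term dictionary are looked up in free text and each hit is reported with its identifier. The dictionary entries, the placeholder identifiers, and the function names `build_pattern` and `tag` are all assumptions made for illustration; they are not taken from the ENVIRONMENTS software or from ENVO.

```python
import re

# Hypothetical toy dictionary: term names mapped to placeholder identifiers.
# These are NOT real ENVO identifiers.
TOY_DICTIONARY = {
    "coral reef": "ENVO:placeholder-coral-reef",
    "reef": "ENVO:placeholder-reef",
    "forest": "ENVO:placeholder-forest",
    "freshwater lake": "ENVO:placeholder-freshwater-lake",
}


def build_pattern(dictionary):
    """Compile one regex that matches any dictionary name as a whole word."""
    # Longest names first, so "coral reef" is preferred over "reef".
    names = sorted(dictionary, key=len, reverse=True)
    alternation = "|".join(re.escape(name) for name in names)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


def tag(text, dictionary=TOY_DICTIONARY):
    """Return (start, end, matched_text, identifier) for each dictionary hit."""
    pattern = build_pattern(dictionary)
    return [
        (m.start(), m.end(), m.group(0), dictionary[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]


if __name__ == "__main__":
    for hit in tag("This species lives on a coral reef near a forest."):
        print(hit)
```

This sketch matches exact names only, so plurals and spelling variants such as "reefs" are not handled; a production tagger would need a far larger dictionary and more careful matching.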
format Online
Article
Text
id pubmed-4443677
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4443677 2015-06-05 ENVIRONMENTS and EOL: identification of Environment Ontology terms in text and the annotation of the Encyclopedia of Life Pafilis, Evangelos Frankild, Sune P. Schnetzer, Julia Fanini, Lucia Faulwetter, Sarah Pavloudi, Christina Vasileiadou, Katerina Leary, Patrick Hammock, Jennifer Schulz, Katja Parr, Cynthia Sims Arvanitidis, Christos Jensen, Lars Juhl Bioinformatics Applications Notes Summary: The association of organisms to their environments is a key issue in exploring biodiversity patterns. This knowledge has traditionally been scattered, but textual descriptions of taxa and their habitats are now being consolidated in centralized resources. However, structured annotations are needed to facilitate large-scale analyses. Therefore, we developed ENVIRONMENTS, a fast dictionary-based tagger capable of identifying Environment Ontology (ENVO) terms in text. We evaluate the accuracy of the tagger on a new manually curated corpus of 600 Encyclopedia of Life (EOL) species pages. We use the tagger to associate taxa with environments by tagging EOL text content monthly, and integrate the results into the EOL to disseminate them to a broad audience of users. Availability and implementation: The software and the corpus are available under the open-source BSD and the CC-BY-NC-SA 3.0 licenses, respectively, at http://environments.hcmr.gr Contact: pafilis@hcmr.gr or lars.juhl.jensen@cpr.ku.dk Supplementary information: Supplementary data are available at Bioinformatics online. Oxford University Press 2015-06-01 2015-01-24 /pmc/articles/PMC4443677/ /pubmed/25619994 http://dx.doi.org/10.1093/bioinformatics/btv045 Text en © The Author 2015. Published by Oxford University Press.
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title ENVIRONMENTS and EOL: identification of Environment Ontology terms in text and the annotation of the Encyclopedia of Life
topic Applications Notes