
An analysis on the entity annotations in biological corpora

Collections of documents annotated with semantic entities and relationships are crucial resources to support the development and evaluation of text mining solutions for the biomedical domain. Here I present an overview of 36 corpora and show an analysis on the semantic annotations they contain. Annotatio...


Bibliographic Details
Main author: Neves, Mariana
Format: Online Article Text
Language: English
Published: F1000Research 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4168744/
https://www.ncbi.nlm.nih.gov/pubmed/25254099
http://dx.doi.org/10.12688/f1000research.3216.1
_version_ 1782335610058440704
author Neves, Mariana
author_facet Neves, Mariana
author_sort Neves, Mariana
collection PubMed
description Collections of documents annotated with semantic entities and relationships are crucial resources to support the development and evaluation of text mining solutions for the biomedical domain. Here I present an overview of 36 corpora and show an analysis on the semantic annotations they contain. Annotations for entity types were classified into six semantic groups, and an overview of the semantic entities which can be found in each corpus is shown. Results show that while some semantic entities, such as genes, proteins and chemicals, are consistently annotated in many collections, corpora available for diseases, variations and mutations are still few, in spite of their importance in the biological domain.
format Online
Article
Text
id pubmed-4168744
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher F1000Research
record_format MEDLINE/PubMed
spelling pubmed-41687442014-09-23 An analysis on the entity annotations in biological corpora Neves, Mariana F1000Res Review Collection of documents annotated with semantic entities and relationships are crucial resources to support development and evaluation of text mining solutions for the biomedical domain. Here I present an overview of 36 corpora and show an analysis on the semantic annotations they contain. Annotations for entity types were classified into six semantic groups and an overview on the semantic entities which can be found in each corpus is shown. Results show that while some semantic entities, such as genes, proteins and chemicals are consistently annotated in many collections, corpora available for diseases, variations and mutations are still few, in spite of their importance in the biological domain. F1000Research 2014-04-25 /pmc/articles/PMC4168744/ /pubmed/25254099 http://dx.doi.org/10.12688/f1000research.3216.1 Text en Copyright: © 2014 Neves M http://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. http://creativecommons.org/publicdomain/zero/1.0/ Data associated with the article are available under the terms of the Creative Commons Zero "No rights reserved" data waiver (CC0 1.0 Public domain dedication).
spellingShingle Review
Neves, Mariana
An analysis on the entity annotations in biological corpora
title An analysis on the entity annotations in biological corpora
title_full An analysis on the entity annotations in biological corpora
title_fullStr An analysis on the entity annotations in biological corpora
title_full_unstemmed An analysis on the entity annotations in biological corpora
title_short An analysis on the entity annotations in biological corpora
title_sort analysis on the entity annotations in biological corpora
topic Review
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4168744/
https://www.ncbi.nlm.nih.gov/pubmed/25254099
http://dx.doi.org/10.12688/f1000research.3216.1
work_keys_str_mv AT nevesmariana ananalysisontheentityannotationsinbiologicalcorpora
AT nevesmariana analysisontheentityannotationsinbiologicalcorpora