Extraction of chemical-induced diseases using prior knowledge and textual information

Bibliographic Details
Main authors: Pons, Ewoud; Becker, Benedikt F.H.; Akhondi, Saber A.; Afzal, Zubair; van Mulligen, Erik M.; Kors, Jan A.
Format: Online Article Text
Language: English
Published: Oxford University Press 2016
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4831722/
https://www.ncbi.nlm.nih.gov/pubmed/27081155
http://dx.doi.org/10.1093/database/baw046
author Pons, Ewoud
Becker, Benedikt F.H.
Akhondi, Saber A.
Afzal, Zubair
van Mulligen, Erik M.
Kors, Jan A.
collection PubMed
description We describe our approach to the chemical–disease relation (CDR) task in the BioCreative V challenge. The CDR task consists of two subtasks: automatic disease-named entity recognition and normalization (DNER), and extraction of chemical-induced diseases (CIDs) from Medline abstracts. For the DNER subtask, we used our concept recognition tool Peregrine, in combination with several optimization steps. For the CID subtask, our system, which we named RELigator, was trained on a rich feature set, comprising features derived from a graph database containing prior knowledge about chemicals and diseases, and linguistic and statistical features derived from the abstracts in the CDR training corpus. We describe the systems that were developed and present evaluation results for both subtasks on the CDR test set. For DNER, our Peregrine system reached an F-score of 0.757. For CID, the system achieved an F-score of 0.526, which ranked second among 18 participating teams. Several post-challenge modifications of the systems resulted in substantially improved F-scores (0.828 for DNER and 0.602 for CID). RELigator is available as a web service at http://biosemantics.org/index.php/software/religator.
format Online
Article
Text
id pubmed-4831722
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4831722 2016-04-18
Database (Oxford) Original Article
Oxford University Press 2016-04-14
/pmc/articles/PMC4831722/ /pubmed/27081155 http://dx.doi.org/10.1093/database/baw046
Text en © The Author(s) 2016. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Extraction of chemical-induced diseases using prior knowledge and textual information
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4831722/
https://www.ncbi.nlm.nih.gov/pubmed/27081155
http://dx.doi.org/10.1093/database/baw046