Extraction of chemical-induced diseases using prior knowledge and textual information
We describe our approach to the chemical–disease relation (CDR) task in the BioCreative V challenge. The CDR task consists of two subtasks: automatic disease-named entity recognition and normalization (DNER), and extraction of chemical-induced diseases (CIDs) from Medline abstracts. For the DNER sub...
Main authors: | Pons, Ewoud; Becker, Benedikt F.H.; Akhondi, Saber A.; Afzal, Zubair; van Mulligen, Erik M.; Kors, Jan A. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4831722/ https://www.ncbi.nlm.nih.gov/pubmed/27081155 http://dx.doi.org/10.1093/database/baw046 |
_version_ | 1782427119894134784 |
---|---|
author | Pons, Ewoud; Becker, Benedikt F.H.; Akhondi, Saber A.; Afzal, Zubair; van Mulligen, Erik M.; Kors, Jan A. |
author_facet | Pons, Ewoud; Becker, Benedikt F.H.; Akhondi, Saber A.; Afzal, Zubair; van Mulligen, Erik M.; Kors, Jan A. |
author_sort | Pons, Ewoud |
collection | PubMed |
description | We describe our approach to the chemical–disease relation (CDR) task in the BioCreative V challenge. The CDR task consists of two subtasks: automatic disease-named entity recognition and normalization (DNER), and extraction of chemical-induced diseases (CIDs) from Medline abstracts. For the DNER subtask, we used our concept recognition tool Peregrine, in combination with several optimization steps. For the CID subtask, our system, which we named RELigator, was trained on a rich feature set, comprising features derived from a graph database containing prior knowledge about chemicals and diseases, and linguistic and statistical features derived from the abstracts in the CDR training corpus. We describe the systems that were developed and present evaluation results for both subtasks on the CDR test set. For DNER, our Peregrine system reached an F-score of 0.757. For CID, the system achieved an F-score of 0.526, which ranked second among 18 participating teams. Several post-challenge modifications of the systems resulted in substantially improved F-scores (0.828 for DNER and 0.602 for CID). RELigator is available as a web service at http://biosemantics.org/index.php/software/religator. |
format | Online Article Text |
id | pubmed-4831722 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-48317222016-04-18 Extraction of chemical-induced diseases using prior knowledge and textual information Pons, Ewoud Becker, Benedikt F.H. Akhondi, Saber A. Afzal, Zubair van Mulligen, Erik M. Kors, Jan A. Database (Oxford) Original Article We describe our approach to the chemical–disease relation (CDR) task in the BioCreative V challenge. The CDR task consists of two subtasks: automatic disease-named entity recognition and normalization (DNER), and extraction of chemical-induced diseases (CIDs) from Medline abstracts. For the DNER subtask, we used our concept recognition tool Peregrine, in combination with several optimization steps. For the CID subtask, our system, which we named RELigator, was trained on a rich feature set, comprising features derived from a graph database containing prior knowledge about chemicals and diseases, and linguistic and statistical features derived from the abstracts in the CDR training corpus. We describe the systems that were developed and present evaluation results for both subtasks on the CDR test set. For DNER, our Peregrine system reached an F-score of 0.757. For CID, the system achieved an F-score of 0.526, which ranked second among 18 participating teams. Several post-challenge modifications of the systems resulted in substantially improved F-scores (0.828 for DNER and 0.602 for CID). RELigator is available as a web service at http://biosemantics.org/index.php/software/religator. Oxford University Press 2016-04-14 /pmc/articles/PMC4831722/ /pubmed/27081155 http://dx.doi.org/10.1093/database/baw046 Text en © The Author(s) 2016. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Article Pons, Ewoud Becker, Benedikt F.H. Akhondi, Saber A. Afzal, Zubair van Mulligen, Erik M. Kors, Jan A. Extraction of chemical-induced diseases using prior knowledge and textual information |
title | Extraction of chemical-induced diseases using prior knowledge and textual information |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4831722/ https://www.ncbi.nlm.nih.gov/pubmed/27081155 http://dx.doi.org/10.1093/database/baw046 |