
Assessing the state of the art in biomedical relation extraction: overview of the BioCreative V chemical-disease relation (CDR) task

Manually curating chemicals, diseases and their relationships is significantly important to biomedical research, but it is plagued by its high cost and the rapid growth of the biomedical literature. In recent years, there has been a growing interest in developing computational approaches for automat...


Bibliographic Details
Main Authors: Wei, Chih-Hsuan, Peng, Yifan, Leaman, Robert, Davis, Allan Peter, Mattingly, Carolyn J., Li, Jiao, Wiegers, Thomas C., Lu, Zhiyong
Format: Online Article Text
Language: English
Published: Oxford University Press 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4799720/
https://www.ncbi.nlm.nih.gov/pubmed/26994911
http://dx.doi.org/10.1093/database/baw032
author Wei, Chih-Hsuan
Peng, Yifan
Leaman, Robert
Davis, Allan Peter
Mattingly, Carolyn J.
Li, Jiao
Wiegers, Thomas C.
Lu, Zhiyong
collection PubMed
description Manually curating chemicals, diseases and their relationships is highly important to biomedical research, but it is hampered by its high cost and the rapid growth of the biomedical literature. In recent years, there has been growing interest in developing computational approaches for automatic chemical-disease relation (CDR) extraction. Despite these attempts, the lack of a comprehensive benchmarking dataset has limited the comparison of different techniques needed to assess and advance the current state of the art. To this end, we organized a challenge task through BioCreative V to automatically extract CDRs from the literature. We designed two challenge tasks: disease named entity recognition (DNER) and chemical-induced disease (CID) relation extraction. To assist system development and assessment, we created a large annotated text corpus that consisted of human annotations of chemicals, diseases and their interactions from 1500 PubMed articles. A total of 34 teams worldwide participated in the CDR task: 16 (DNER) and 18 (CID). The best systems achieved an F-score of 86.46% for the DNER task, a result that approaches the human inter-annotator agreement (0.8875), and an F-score of 57.03% for the CID task, the highest results ever reported for such tasks. When team results were combined via machine learning, the ensemble system further improved over the best team results, achieving F-scores of 88.89% and 62.80% for the DNER and CID tasks, respectively. Another novel aspect of our evaluation was testing each participating system’s ability to return real-time results: the average response times of the teams’ DNER and CID web service systems were 5.6 and 9.3 s, respectively. Most teams used hybrid systems based on machine learning for their submissions. Given the level of participation and results, we found our task to be successful in engaging the text-mining research community, producing a large annotated corpus and improving the results of automatic disease recognition and CDR extraction. Database URL: http://www.biocreative.org/tasks/biocreative-v/track-3-cdr/
format Online
Article
Text
id pubmed-4799720
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4799720 2016-03-21 Assessing the state of the art in biomedical relation extraction: overview of the BioCreative V chemical-disease relation (CDR) task Wei, Chih-Hsuan Peng, Yifan Leaman, Robert Davis, Allan Peter Mattingly, Carolyn J. Li, Jiao Wiegers, Thomas C. Lu, Zhiyong Database (Oxford) Original Article Manually curating chemicals, diseases and their relationships is highly important to biomedical research, but it is hampered by its high cost and the rapid growth of the biomedical literature. In recent years, there has been growing interest in developing computational approaches for automatic chemical-disease relation (CDR) extraction. Despite these attempts, the lack of a comprehensive benchmarking dataset has limited the comparison of different techniques needed to assess and advance the current state of the art. To this end, we organized a challenge task through BioCreative V to automatically extract CDRs from the literature. We designed two challenge tasks: disease named entity recognition (DNER) and chemical-induced disease (CID) relation extraction. To assist system development and assessment, we created a large annotated text corpus that consisted of human annotations of chemicals, diseases and their interactions from 1500 PubMed articles. A total of 34 teams worldwide participated in the CDR task: 16 (DNER) and 18 (CID). The best systems achieved an F-score of 86.46% for the DNER task, a result that approaches the human inter-annotator agreement (0.8875), and an F-score of 57.03% for the CID task, the highest results ever reported for such tasks. When team results were combined via machine learning, the ensemble system further improved over the best team results, achieving F-scores of 88.89% and 62.80% for the DNER and CID tasks, respectively. Another novel aspect of our evaluation was testing each participating system’s ability to return real-time results: the average response times of the teams’ DNER and CID web service systems were 5.6 and 9.3 s, respectively. Most teams used hybrid systems based on machine learning for their submissions. Given the level of participation and results, we found our task to be successful in engaging the text-mining research community, producing a large annotated corpus and improving the results of automatic disease recognition and CDR extraction. Database URL: http://www.biocreative.org/tasks/biocreative-v/track-3-cdr/ Oxford University Press 2016-03-19 /pmc/articles/PMC4799720/ /pubmed/26994911 http://dx.doi.org/10.1093/database/baw032 Text en Published by Oxford University Press 2016. This work is written by US Government employees and is in the public domain in the US.
title Assessing the state of the art in biomedical relation extraction: overview of the BioCreative V chemical-disease relation (CDR) task
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4799720/
https://www.ncbi.nlm.nih.gov/pubmed/26994911
http://dx.doi.org/10.1093/database/baw032