Mining clinical relationships from patient narratives
BACKGROUND: The Clinical E-Science Framework (CLEF) project has built a system to extract clinically significant information from the textual component of medical records in order to support clinical research, evidence-based healthcare and genotype-meets-phenotype informatics. One part of this syste...
Main authors: | Roberts, Angus; Gaizauskas, Robert; Hepple, Mark; Guo, Yikun |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2008 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2586752/ https://www.ncbi.nlm.nih.gov/pubmed/19025689 http://dx.doi.org/10.1186/1471-2105-9-S11-S3 |
_version_ | 1782160908383944704 |
---|---|
author | Roberts, Angus; Gaizauskas, Robert; Hepple, Mark; Guo, Yikun |
author_facet | Roberts, Angus; Gaizauskas, Robert; Hepple, Mark; Guo, Yikun |
author_sort | Roberts, Angus |
collection | PubMed |
description | BACKGROUND: The Clinical E-Science Framework (CLEF) project has built a system to extract clinically significant information from the textual component of medical records in order to support clinical research, evidence-based healthcare and genotype-meets-phenotype informatics. One part of this system is the identification of relationships between clinically important entities in the text. Typical approaches to relationship extraction in this domain have used full parses, domain-specific grammars, and large knowledge bases encoding domain knowledge. In other areas of biomedical NLP, statistical machine learning (ML) approaches are now routinely applied to relationship extraction. We report on the novel application of these statistical techniques to the extraction of clinical relationships. RESULTS: We have designed and implemented an ML-based system for relation extraction, using support vector machines, and trained and tested it on a corpus of oncology narratives hand-annotated with clinically important relationships. Over a class of seven relation types, the system achieves an average F1 score of 72%, only slightly behind an indicative measure of human inter-annotator agreement on the same task. We investigate the effectiveness of different features for this task, how extraction performance varies between inter- and intra-sentential relationships, and examine the amount of training data needed to learn various relationships. CONCLUSION: We have shown that it is possible to extract important clinical relationships from text, using supervised statistical ML techniques, at levels of accuracy approaching those of human annotators. Given the importance of relation extraction as an enabling technology for text mining and given also the ready adaptability of systems based on our supervised learning approach to other clinical relationship extraction tasks, this result has significance for clinical text mining more generally, though further work to confirm our encouraging results should be carried out on a larger sample of narratives and relationship types. |
format | Text |
id | pubmed-2586752 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2008 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2586752 2008-11-26 Mining clinical relationships from patient narratives Roberts, Angus Gaizauskas, Robert Hepple, Mark Guo, Yikun BMC Bioinformatics Research BioMed Central 2008-11-19 /pmc/articles/PMC2586752/ /pubmed/19025689 http://dx.doi.org/10.1186/1471-2105-9-S11-S3 Text en Copyright © 2008 Roberts et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Roberts, Angus Gaizauskas, Robert Hepple, Mark Guo, Yikun Mining clinical relationships from patient narratives |
title | Mining clinical relationships from patient narratives |
title_full | Mining clinical relationships from patient narratives |
title_fullStr | Mining clinical relationships from patient narratives |
title_full_unstemmed | Mining clinical relationships from patient narratives |
title_short | Mining clinical relationships from patient narratives |
title_sort | mining clinical relationships from patient narratives |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2586752/ https://www.ncbi.nlm.nih.gov/pubmed/19025689 http://dx.doi.org/10.1186/1471-2105-9-S11-S3 |
work_keys_str_mv | AT robertsangus miningclinicalrelationshipsfrompatientnarratives AT gaizauskasrobert miningclinicalrelationshipsfrompatientnarratives AT hepplemark miningclinicalrelationshipsfrompatientnarratives AT guoyikun miningclinicalrelationshipsfrompatientnarratives |