
Tumor information extraction in radiology reports for hepatocellular carcinoma patients

Hepatocellular carcinoma (HCC) is a deadly disease affecting the liver for which there are many available therapies. Targeting treatments towards specific patient groups necessitates defining patients by stage of disease. Criteria for such stagings include information on tumor number, size, and anat...

Bibliographic Details
Main authors: Yim, Wen-wai; Denman, Tyler; Kwan, Sharon W.; Yetisgen, Meliha
Format: Online Article Text
Language: English
Published: American Medical Informatics Association 2016
Subjects: Articles
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5001784/
https://www.ncbi.nlm.nih.gov/pubmed/27570686
author Yim, Wen-wai
Denman, Tyler
Kwan, Sharon W.
Yetisgen, Meliha
collection PubMed
description Hepatocellular carcinoma (HCC) is a deadly disease affecting the liver for which there are many available therapies. Targeting treatments towards specific patient groups necessitates defining patients by stage of disease. Criteria for such staging include information on tumor number, size, and anatomic location, typically only found in narrative clinical text in the electronic medical record (EMR). Natural language processing (NLP) offers an automatic and scalable means to extract this information, which can further evidence-based research. In this paper, we created a corpus of 101 radiology reports annotated for tumor information. Afterwards, we applied machine learning algorithms to extract tumor information. Our inter-annotator partial match agreement scored at 0.93 and 0.90 F1 for entities and relations, respectively. Based on the annotated corpus, our sequential labeling entity extraction achieved 0.87 F1 partial match, and our maximum entropy classification relation extraction achieved 0.89 and 0.74 F1 with gold and system entities, respectively.
format Online
Article
Text
id pubmed-5001784
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher American Medical Informatics Association
record_format MEDLINE/PubMed
spelling pubmed-5001784 2016-08-26 Tumor information extraction in radiology reports for hepatocellular carcinoma patients Yim, Wen-wai Denman, Tyler Kwan, Sharon W. Yetisgen, Meliha AMIA Jt Summits Transl Sci Proc Articles Hepatocellular carcinoma (HCC) is a deadly disease affecting the liver for which there are many available therapies. Targeting treatments towards specific patient groups necessitates defining patients by stage of disease. Criteria for such staging include information on tumor number, size, and anatomic location, typically only found in narrative clinical text in the electronic medical record (EMR). Natural language processing (NLP) offers an automatic and scalable means to extract this information, which can further evidence-based research. In this paper, we created a corpus of 101 radiology reports annotated for tumor information. Afterwards, we applied machine learning algorithms to extract tumor information. Our inter-annotator partial match agreement scored at 0.93 and 0.90 F1 for entities and relations, respectively. Based on the annotated corpus, our sequential labeling entity extraction achieved 0.87 F1 partial match, and our maximum entropy classification relation extraction achieved 0.89 and 0.74 F1 with gold and system entities, respectively. American Medical Informatics Association 2016-07-20 /pmc/articles/PMC5001784/ /pubmed/27570686 Text en ©2016 AMIA - All rights reserved. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose
title Tumor information extraction in radiology reports for hepatocellular carcinoma patients
topic Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5001784/
https://www.ncbi.nlm.nih.gov/pubmed/27570686