Cargando…

Tumor information extraction in radiology reports for hepatocellular carcinoma patients

Hepatocellular carcinoma (HCC) is a deadly disease affecting the liver for which there are many available therapies. Targeting treatments towards specific patient groups necessitates defining patients by stage of disease. Criteria for such stagings include information on tumor number, size, and anat...

Descripción completa

Detalles Bibliográficos
Autores principales: Yim, Wen-wai, Denman, Tyler, Kwan, Sharon W., Yetisgen, Meliha
Formato: Online Artículo Texto
Lenguaje:English
Publicado: American Medical Informatics Association 2016
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5001784/
https://www.ncbi.nlm.nih.gov/pubmed/27570686
Descripción
Sumario:Hepatocellular carcinoma (HCC) is a deadly disease affecting the liver for which there are many available therapies. Targeting treatments towards specific patient groups necessitates defining patients by stage of disease. Criteria for such stagings include information on tumor number, size, and anatomic location, typically only found in narrative clinical text in the electronic medical record (EMR). Natural language processing (NLP) offers an automatic and scale-able means to extract this information, which can further evidence-based research. In this paper, we created a corpus of 101 radiology reports annotated for tumor information. Afterwards we applied machine learning algorithms to extract tumor information. Our inter-annotator partial match agreement scored at 0.93 and 0.90 F1 for entities and relations, respectively. Based on the annotated corpus, our sequential labeling entity extraction achieved 0.87 F1 partial match, and our maximum entropy classification relation extraction achieved scores 0.89 and 0. 74 F1 with gold and system entities, respectively.