Toward Complete Structured Information Extraction from Radiology Reports Using Machine Learning

Bibliographic Details
Main Authors: Steinkamp, Jackson M., Chambers, Charles, Lalevic, Darco, Zafar, Hanna M., Cook, Tessa S.
Format: Online Article Text
Language: English
Published: Springer International Publishing 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6646440/
https://www.ncbi.nlm.nih.gov/pubmed/31218554
http://dx.doi.org/10.1007/s10278-019-00234-y
Description
Summary: Unstructured and semi-structured radiology reports represent an underutilized trove of information for machine learning (ML)-based clinical informatics applications, including abnormality tracking systems, research cohort identification, point-of-care summarization, semi-automated report writing, and as a source of weak data labels for training image processing systems. Clinical ML systems must be interpretable to ensure user trust. To create interpretable models applicable to all of these tasks, we can build general-purpose systems which extract all relevant human-level assertions or “facts” documented in reports; identifying these facts is an information extraction (IE) task. Previous IE work in radiology has focused on a limited set of information, and extracts isolated entities (i.e., single words such as “lesion” or “cyst”) rather than complete facts, which require the linking of multiple entities and modifiers. Here, we develop a prototype system to extract all useful information in abdominopelvic radiology reports (findings, recommendations, clinical history, procedures, imaging indications and limitations, etc.), in the form of complete, contextualized facts. We construct an information schema to capture the bulk of information in reports, develop real-time ML models to extract this information, and demonstrate the feasibility and performance of the system.