A computational linguistics motivated mapping of ICPC-2 PLUS to SNOMED CT

BACKGROUND: A great challenge in sharing data across information systems in general practice is the lack of interoperability between different terminologies or coding schema used in the information systems. Mapping of medical vocabularies to a standardised terminology is needed to solve data interop...

Bibliographic Details
Main authors: Wang, Yefeng, Patrick, Jon, Miller, Graeme, O'Hallaran, Julie
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2582792/
https://www.ncbi.nlm.nih.gov/pubmed/19007442
http://dx.doi.org/10.1186/1472-6947-8-S1-S5
author Wang, Yefeng
Patrick, Jon
Miller, Graeme
O'Hallaran, Julie
collection PubMed
description BACKGROUND: A great challenge in sharing data across information systems in general practice is the lack of interoperability between the different terminologies or coding schemes those systems use. Mapping medical vocabularies to a standardised terminology is needed to solve data interoperability problems.
METHODS: We present a system that automatically maps an interface terminology, ICPC-2 PLUS, to SNOMED CT. The system proposes three mapping stages. First, UMLS Metathesaurus mapping uses explicit relationships between ICPC-2 PLUS and SNOMED CT terms in the UMLS. Second, computational linguistic mapping uses natural language processing techniques and lexical similarity. Finally, post-coordination mapping allows one ICPC-2 PLUS term to be mapped to an aggregation of two or more SNOMED CT terms.
RESULTS: A total of 5,971 of the 7,410 ICPC-2 PLUS terms (80.58%) were mapped to SNOMED CT across the three stages, with different levels of accuracy. UMLS mapping covered 53.0% of ICPC-2 PLUS terms, with a precision of 96.46% and an overall recall of 44.89%. Lexical mapping raised the proportion of mapped terms to 60.31%, and post-coordination mapping added a further 20.27%. A manual review of part of the mapping shows that the precision of the lexical mappings is around 90%. The accuracy of post-coordination has not yet been evaluated. Unmapped and mismatched terms are due to structural differences between ICPC-2 PLUS and SNOMED CT. Terms present in ICPC-2 PLUS but absent from SNOMED CT caused a large proportion of the mapping failures.
CONCLUSION: Mapping terminologies to a standard vocabulary is a way to facilitate consistent medical data exchange and to achieve system interoperability and data standardisation. Broad-scale mapping cannot be achieved by any single method, and methods based on computational linguistics can be very useful for the task. Automating as much of this process as possible turns the searching and mapping task into a validation task, which can reduce the cost and increase the efficiency and accuracy of the task compared with manual methods.
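The three-stage cascade described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the dictionaries, the `SCT-n` identifiers, the stopword list and the function names `normalise`, `map_term` and `coverage` are all hypothetical, and the lexical stage is reduced to an order-insensitive token match, which is only one simple form of lexical similarity.

```python
import re

# Hypothetical toy data: none of these terms or identifiers come from the
# paper, UMLS, ICPC-2 PLUS or SNOMED CT; they only illustrate the control flow.
UMLS_LINKS = {"asthma": "SCT-1"}  # stage 1: explicit source-to-target links
SNOMED_TERMS = {                  # stages 2 and 3: target descriptions
    "asthma": "SCT-1",
    "fracture of femur": "SCT-2",
    "pain": "SCT-3",
    "knee": "SCT-4",
}
STOPWORDS = {"of", "the", "in"}


def normalise(term):
    """Lower-case, drop stopwords and sort tokens so word order is ignored."""
    tokens = re.findall(r"[a-z0-9]+", term.lower())
    return tuple(sorted(t for t in tokens if t not in STOPWORDS))


LEXICAL_INDEX = {normalise(t): cid for t, cid in SNOMED_TERMS.items()}


def map_term(term):
    """Return (stage, [concept ids]), or (None, []) if the term is unmapped."""
    key = term.lower()
    if key in UMLS_LINKS:                      # stage 1: explicit link
        return "umls", [UMLS_LINKS[key]]
    norm = normalise(term)
    if norm in LEXICAL_INDEX:                  # stage 2: lexical match
        return "lexical", [LEXICAL_INDEX[norm]]
    # Stage 3: post-coordination, every token must map to its own concept.
    parts = [LEXICAL_INDEX.get((tok,)) for tok in norm]
    if len(parts) >= 2 and all(parts):
        return "post-coordination", parts
    return None, []


def coverage(terms):
    """Fraction of terms mapped by any stage."""
    mapped = sum(1 for t in terms if map_term(t)[0] is not None)
    return mapped / len(terms)


if __name__ == "__main__":
    for t in ["Asthma", "Femur fracture", "Knee pain", "Tennis elbow"]:
        print(t, "->", map_term(t))
```

Each stage is tried only when the previous one fails, so the output of such a cascade is a list of candidate mappings for a reviewer to validate rather than a finished map. The reported figures follow the same cumulative logic: 60.31% after the lexical stage plus 20.27% from post-coordination gives the overall 80.58% (5,971 of 7,410 terms).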
id pubmed-2582792
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-2582792 2008-11-14 A computational linguistics motivated mapping of ICPC-2 PLUS to SNOMED CT Wang, Yefeng Patrick, Jon Miller, Graeme O'Hallaran, Julie BMC Med Inform Decis Mak Proceedings BioMed Central 2008-10-27 /pmc/articles/PMC2582792/ /pubmed/19007442 http://dx.doi.org/10.1186/1472-6947-8-S1-S5 Text en Copyright © 2008 Wang et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A computational linguistics motivated mapping of ICPC-2 PLUS to SNOMED CT
topic Proceedings