A computational linguistics motivated mapping of ICPC-2 PLUS to SNOMED CT
BACKGROUND: A great challenge in sharing data across information systems in general practice is the lack of interoperability between different terminologies or coding schema used in the information systems. Mapping of medical vocabularies to a standardised terminology is needed to solve data interop...
Main authors: | Wang, Yefeng; Patrick, Jon; Miller, Graeme; O'Hallaran, Julie |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2008 |
Subjects: | Proceedings |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2582792/ https://www.ncbi.nlm.nih.gov/pubmed/19007442 http://dx.doi.org/10.1186/1472-6947-8-S1-S5 |
_version_ | 1782160706757459968 |
---|---|
author | Wang, Yefeng Patrick, Jon Miller, Graeme O'Hallaran, Julie |
author_sort | Wang, Yefeng |
collection | PubMed |
description | BACKGROUND: A major challenge in sharing data across information systems in general practice is the lack of interoperability between the different terminologies or coding schemes used in those systems. Mapping medical vocabularies to a standardised terminology is needed to solve data interoperability problems. METHODS: We present a system to automatically map an interface terminology, ICPC-2 PLUS, to SNOMED CT. The system proposes three mapping steps. The UMLS Metathesaurus mapping uses explicit relationships between ICPC-2 PLUS and SNOMED CT terms in the UMLS library to perform the first stage of the mapping. Computational linguistic mapping uses natural language processing techniques and lexical similarities for the second stage of mapping between the terminologies. Finally, the post-coordination mapping allows one ICPC-2 PLUS term to be mapped to an aggregation of two or more SNOMED CT terms. RESULTS: A total of 5,971 of all 7,410 ICPC-2 terms (80.58%) were mapped to SNOMED CT using the three stages, but with different levels of accuracy. UMLS mapping mapped 53.0% of ICPC-2 PLUS terms to SNOMED CT with a precision of 96.46% and an overall recall of 44.89%. Lexical mapping increased the result to 60.31%, and post-coordination mapping gave an increase of 20.27% in mapped terms. A manual review of part of the mapping shows that the precision of the lexical mappings is around 90%. The accuracy of post-coordination has not yet been evaluated. Unmapped and mismatched terms are due to differences in structure between ICPC-2 PLUS and SNOMED CT. Terms contained in ICPC-2 PLUS but not in SNOMED CT caused a large proportion of the mapping failures. CONCLUSION: Mapping terminologies to a standard vocabulary is a way to facilitate consistent medical data exchange and to achieve system interoperability and data standardisation. Broad-scale mapping cannot be achieved by any single method, and methods based on computational linguistics can be very useful for the task. Automating as much of this process as possible turns the searching and mapping task into a validation task, which can effectively reduce the cost and increase the efficiency and accuracy of this task over manual methods. |
format | Text |
id | pubmed-2582792 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2008 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2582792 2008-11-14 A computational linguistics motivated mapping of ICPC-2 PLUS to SNOMED CT Wang, Yefeng Patrick, Jon Miller, Graeme O'Hallaran, Julie BMC Med Inform Decis Mak Proceedings BioMed Central 2008-10-27 /pmc/articles/PMC2582792/ /pubmed/19007442 http://dx.doi.org/10.1186/1472-6947-8-S1-S5 Text en Copyright © 2008 Wang et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Proceedings Wang, Yefeng Patrick, Jon Miller, Graeme O'Hallaran, Julie A computational linguistics motivated mapping of ICPC-2 PLUS to SNOMED CT |
title | A computational linguistics motivated mapping of ICPC-2 PLUS to SNOMED CT |
title_sort | computational linguistics motivated mapping of icpc-2 plus to snomed ct |
topic | Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2582792/ https://www.ncbi.nlm.nih.gov/pubmed/19007442 http://dx.doi.org/10.1186/1472-6947-8-S1-S5 |
work_keys_str_mv | AT wangyefeng acomputationallinguisticsmotivatedmappingoficpc2plustosnomedct AT patrickjon acomputationallinguisticsmotivatedmappingoficpc2plustosnomedct AT millergraeme acomputationallinguisticsmotivatedmappingoficpc2plustosnomedct AT ohallaranjulie acomputationallinguisticsmotivatedmappingoficpc2plustosnomedct AT wangyefeng computationallinguisticsmotivatedmappingoficpc2plustosnomedct AT patrickjon computationallinguisticsmotivatedmappingoficpc2plustosnomedct AT millergraeme computationallinguisticsmotivatedmappingoficpc2plustosnomedct AT ohallaranjulie computationallinguisticsmotivatedmappingoficpc2plustosnomedct |
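
The description above outlines a three-stage cascade: an explicit cross-reference lookup, then lexical matching, then post-coordination of one source term into two or more target terms. The following is a minimal Python sketch of that control flow only, not the authors' system. Every function name, the token-set matching rule, the greedy cover used for post-coordination, and the toy term lists are assumptions made for illustration; no real UMLS, ICPC-2 PLUS, or SNOMED CT data or codes are used.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A stage is a (name, function) pair; the function returns target terms or None.
Stage = Tuple[str, Callable[[str], Optional[List[str]]]]


def tokens(term: str) -> frozenset:
    """Lower-case a term and split it into a set of word tokens."""
    return frozenset(term.lower().replace(",", " ").split())


def table_stage(table: Dict[str, str]) -> Callable[[str], Optional[List[str]]]:
    """Stage 1 stand-in: look the term up in an explicit cross-reference table."""
    def run(term: str) -> Optional[List[str]]:
        hit = table.get(term)
        return [hit] if hit is not None else None
    return run


def lexical_stage(targets: List[str]) -> Callable[[str], Optional[List[str]]]:
    """Stage 2 stand-in: match terms whose token sets are identical."""
    index = {tokens(t): t for t in targets}
    def run(term: str) -> Optional[List[str]]:
        hit = index.get(tokens(term))
        return [hit] if hit is not None else None
    return run


def postcoordination_stage(targets: List[str]) -> Callable[[str], Optional[List[str]]]:
    """Stage 3 stand-in: cover all source tokens with two or more target terms."""
    def run(term: str) -> Optional[List[str]]:
        remaining = set(tokens(term))
        chosen: List[str] = []
        # Try longer target terms first so multi-word concepts win over single words.
        for target in sorted(targets, key=lambda t: -len(tokens(t))):
            target_tokens = tokens(target)
            if target_tokens and target_tokens <= remaining:
                chosen.append(target)
                remaining -= target_tokens
        return chosen if not remaining and len(chosen) >= 2 else None
    return run


def map_terms(terms: List[str], stages: List[Stage]):
    """Run each term through the stages in order; the first stage that succeeds wins."""
    mapped: Dict[str, Tuple[str, List[str]]] = {}
    unmapped: List[str] = []
    for term in terms:
        for name, stage in stages:
            result = stage(term)
            if result is not None:
                mapped[term] = (name, result)
                break
        else:
            unmapped.append(term)
    return mapped, unmapped


# Toy data, invented for this sketch (plain labels only, no real concept codes).
TARGETS = ["Fracture", "Femur", "Asthma", "Diabetes mellitus"]
CROSSREF = {"Asthma": "Asthma"}
SOURCES = ["Asthma", "Diabetes Mellitus", "Fracture femur", "Check-up"]

STAGES: List[Stage] = [
    ("table", table_stage(CROSSREF)),
    ("lexical", lexical_stage(TARGETS)),
    ("post-coordination", postcoordination_stage(TARGETS)),
]

mapped, unmapped = map_terms(SOURCES, STAGES)
coverage = len(mapped) / len(SOURCES)
```

In this sketch the stages are ordered so that the explicit lookup claims a term before the looser methods are tried, which mirrors the ordering in the description. Terms that no stage can map are collected separately, so they can be passed to manual review, in line with the description's point that automation turns the task into validation.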