Building a model for disease classification integration in oncology, an approach based on the national cancer institute thesaurus
BACKGROUND: Identifying incident cancer cases within a population remains essential for scientific research in oncology. Data produced within electronic health records can be useful for this purpose. Due to the multiplicity of providers, heterogeneous terminologies such as ICD-10 and ICD-O-3 are use...
Main Authors: | Jouhet, Vianney; Mougin, Fleur; Bréchat, Bérénice; Thiessard, Frantz |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5294908/ https://www.ncbi.nlm.nih.gov/pubmed/28173841 http://dx.doi.org/10.1186/s13326-017-0114-4 |
_version_ | 1782505330815533056 |
---|---|
author | Jouhet, Vianney; Mougin, Fleur; Bréchat, Bérénice; Thiessard, Frantz |
author_sort | Jouhet, Vianney |
collection | PubMed |
description | BACKGROUND: Identifying incident cancer cases within a population remains essential for scientific research in oncology. Data produced within electronic health records can be useful for this purpose. Because of the multiplicity of providers, heterogeneous terminologies such as ICD-10 and ICD-O-3 are used to record oncology diagnoses. To enable disease identification based on these diagnoses, disease classifications in oncology need to be integrated. Our aim was to build a model integrating the concepts involved in two disease classifications, namely ICD-10 (diagnosis) and ICD-O-3 (topography and morphology), despite their structural heterogeneity. Based on the NCIt, a “derivative” model for linking diagnoses and topography-morphology combinations was defined and built. ICD-O-3 and ICD-10 codes were then used to instantiate classes of the “derivative” model. Links between terminologies obtained through the model were then compared with mappings provided by the Surveillance, Epidemiology, and End Results (SEER) program. RESULTS: The model integrated 42% of neoplasm ICD-10 codes (excluding metastasis), 98% of ICD-O-3 morphology codes (excluding metastasis) and 68% of ICD-O-3 topography codes. For every code instantiating at least one class in the “derivative” model, comparison with SEER mappings revealed that all mappings were available in the model as a link between the corresponding codes. CONCLUSIONS: We have proposed a method to automatically build a model for integrating ICD-10 and ICD-O-3 based on the NCIt. The resulting “derivative” model is a machine-understandable resource that enables an integrated view of these heterogeneous terminologies. The NCIt structure and the available relationships can help to bridge disease classifications while taking into account their structural and granular heterogeneities. However, (i) inconsistencies exist within the NCIt, leading to misclassifications in the “derivative” model, and (ii) the “derivative” model integrates only part of ICD-10 and ICD-O-3. The NCIt alone is not sufficient for integration purposes, and further work based on other termino-ontological resources is needed to enrich the model and avoid the identified inconsistencies. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13326-017-0114-4) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5294908 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5294908 2017-02-09 J Biomed Semantics Research BioMed Central 2017-02-07 /pmc/articles/PMC5294908/ /pubmed/28173841 http://dx.doi.org/10.1186/s13326-017-0114-4 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Building a model for disease classification integration in oncology, an approach based on the national cancer institute thesaurus |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5294908/ https://www.ncbi.nlm.nih.gov/pubmed/28173841 http://dx.doi.org/10.1186/s13326-017-0114-4 |
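
The abstract describes classes of a “derivative” model that are instantiated by ICD-10 codes and by ICD-O-3 topography-morphology combinations, with the resulting links checked against SEER mappings. The following is only a minimal sketch of that idea under stated assumptions, not the authors' implementation: the names `DerivativeClass`, `model_links`, `coverage` and `unmatched_reference` are hypothetical, the model is reduced to plain Python sets rather than an NCIt-based ontology, and the codes are toy data chosen for illustration, not taken from the article or from SEER.

```python
from dataclasses import dataclass, field


@dataclass
class DerivativeClass:
    """One class of a toy 'derivative' model (hypothetical structure)."""
    label: str
    icd10: set = field(default_factory=set)  # ICD-10 diagnosis codes
    icdo3: set = field(default_factory=set)  # (topography, morphology) pairs


def model_links(model):
    """All (ICD-10, topography, morphology) links implied by shared classes."""
    return {(d, t, m) for c in model for d in c.icd10 for (t, m) in c.icdo3}


def coverage(terminology_codes, model):
    """Fraction of the given ICD-10 codes that instantiate at least one class."""
    used = set().union(*(c.icd10 for c in model))
    return len(terminology_codes & used) / len(terminology_codes)


def unmatched_reference(reference, model):
    """Reference mappings whose codes are in the model but that it does not link.

    Assumption: a mapping is 'in scope' when its ICD-10 code and its
    (topography, morphology) pair each instantiate at least one class.
    """
    icd10 = set().union(*(c.icd10 for c in model))
    pairs = set().union(*(c.icdo3 for c in model))
    in_scope = {(d, t, m) for (d, t, m) in reference
                if d in icd10 and (t, m) in pairs}
    return in_scope - model_links(model)


# Toy data for illustration only.
model = [
    DerivativeClass("Breast carcinoma (toy)", {"C50.9"}, {("C50.9", "8500/3")}),
    DerivativeClass("Lung adenocarcinoma (toy)", {"C34.9"}, {("C34.9", "8140/3")}),
]
all_icd10 = {"C50.9", "C34.9", "C61", "C18.9"}
reference = {("C50.9", "C50.9", "8500/3"), ("C61", "C61.9", "8140/3")}

print(coverage(all_icd10, model))             # 0.5
print(unmatched_reference(reference, model))  # set()
```

In this toy setup, an empty result from `unmatched_reference` corresponds to the situation reported in the abstract, where every in-scope reference mapping is also available as a link in the model, while `coverage` mirrors the kind of percentage reported for each terminology.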