Development of a customised programme to standardise comorbidity diagnosis codes in a large-scale database
OBJECTIVES: The transition from ICD-9 to ICD-10 coding creates a data standardisation challenge for large-scale longitudinal research. We sought to develop a programme that automated this standardisation process. METHODS: A programme was developed to standardise ICD-9 and ICD-10 terminology into one...
Main Authors: | Osorio, Robert C; Raygor, Kunal P; Abla, Adib A |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BMJ Publishing Group 2022 |
Subjects: | Implementer Report |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9047883/ https://www.ncbi.nlm.nih.gov/pubmed/35477690 http://dx.doi.org/10.1136/bmjhci-2021-100532 |
_version_ | 1784695821065781248 |
---|---|
author | Osorio, Robert C Raygor, Kunal P Abla, Adib A |
author_facet | Osorio, Robert C Raygor, Kunal P Abla, Adib A |
author_sort | Osorio, Robert C |
collection | PubMed |
description | OBJECTIVES: The transition from ICD-9 to ICD-10 coding creates a data standardisation challenge for large-scale longitudinal research. We sought to develop a programme that automated this standardisation process. METHODS: A programme was developed to standardise ICD-9 and ICD-10 terminology into one system. Code was improved to reduce runtime, and two iterations were tested on a joint ICD-9/ICD-10 database of 15.8 million patients. RESULTS: Both programmes successfully standardised diagnostic terminology in the database. While the original programme updated 100 000 cells in 12.5 hours, the improved programme translated 3.1 million cells in 38 min. DISCUSSION: While both programmes successfully translated ICD-related data into a standardised format, the original programme suffered from excessive runtimes. Code improvement with hash tables and parallelisation exponentially reduced these runtimes. CONCLUSION: Databases with ICD-9 and ICD-10 codes require terminology standardisation for analysis. By sharing our programme’s implementation, we hope to assist other researchers in standardising their own databases. |
format | Online Article Text |
id | pubmed-9047883 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BMJ Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-90478832022-05-11 Development of a customised programme to standardise comorbidity diagnosis codes in a large-scale database Osorio, Robert C Raygor, Kunal P Abla, Adib A BMJ Health Care Inform Implementer Report OBJECTIVES: The transition from ICD-9 to ICD-10 coding creates a data standardisation challenge for large-scale longitudinal research. We sought to develop a programme that automated this standardisation process. METHODS: A programme was developed to standardise ICD-9 and ICD-10 terminology into one system. Code was improved to reduce runtime, and two iterations were tested on a joint ICD-9/ICD-10 database of 15.8 million patients. RESULTS: Both programmes successfully standardised diagnostic terminology in the database. While the original programme updated 100 000 cells in 12.5 hours, the improved programme translated 3.1 million cells in 38 min. DISCUSSION: While both programmes successfully translated ICD-related data into a standardised format, the original programme suffered from excessive runtimes. Code improvement with hash tables and parallelisation exponentially reduced these runtimes. CONCLUSION: Databases with ICD-9 and ICD-10 codes require terminology standardisation for analysis. By sharing our programme’s implementation, we hope to assist other researchers in standardising their own databases. BMJ Publishing Group 2022-04-26 /pmc/articles/PMC9047883/ /pubmed/35477690 http://dx.doi.org/10.1136/bmjhci-2021-100532 Text en © Author(s) (or their employer(s)) 2022. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ. 
https://creativecommons.org/licenses/by-nc/4.0/ This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/. |
spellingShingle | Implementer Report Osorio, Robert C Raygor, Kunal P Abla, Adib A Development of a customised programme to standardise comorbidity diagnosis codes in a large-scale database |
title | Development of a customised programme to standardise comorbidity diagnosis codes in a large-scale database |
title_full | Development of a customised programme to standardise comorbidity diagnosis codes in a large-scale database |
title_fullStr | Development of a customised programme to standardise comorbidity diagnosis codes in a large-scale database |
title_full_unstemmed | Development of a customised programme to standardise comorbidity diagnosis codes in a large-scale database |
title_short | Development of a customised programme to standardise comorbidity diagnosis codes in a large-scale database |
title_sort | development of a customised programme to standardise comorbidity diagnosis codes in a large-scale database |
topic | Implementer Report |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9047883/ https://www.ncbi.nlm.nih.gov/pubmed/35477690 http://dx.doi.org/10.1136/bmjhci-2021-100532 |
work_keys_str_mv | AT osoriorobertc developmentofacustomisedprogrammetostandardisecomorbiditydiagnosiscodesinalargescaledatabase AT raygorkunalp developmentofacustomisedprogrammetostandardisecomorbiditydiagnosiscodesinalargescaledatabase AT ablaadiba developmentofacustomisedprogrammetostandardisecomorbiditydiagnosiscodesinalargescaledatabase |
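
The abstract in this record credits hash tables for much of the runtime reduction but does not reproduce the authors' code. The following is a minimal Python sketch of the general idea only, not the published implementation: each diagnosis cell is translated with a constant-time dictionary lookup instead of a scan through a crosswalk table. The names `ICD9_TO_ICD10`, `standardise`, and `standardise_column` are hypothetical, and the three mapping entries are illustrative examples rather than a validated crosswalk.

```python
# Illustrative lookup table (hypothetical name). A real crosswalk would be
# loaded from a maintained mapping file; these entries are examples only.
ICD9_TO_ICD10 = {
    "250.00": "E11.9",
    "401.9": "I10",
    "428.0": "I50.9",
}


def standardise(code, mapping=ICD9_TO_ICD10):
    """Return the mapped ICD-10 code, or the trimmed input if no mapping exists."""
    cleaned = code.strip()
    # dict.get is an average O(1) hash lookup, so the cost per cell does not
    # grow with the size of the crosswalk.
    return mapping.get(cleaned, cleaned)


def standardise_column(codes, mapping=ICD9_TO_ICD10):
    """Standardise every cell in a column of diagnosis codes."""
    return [standardise(code, mapping) for code in codes]


if __name__ == "__main__":
    print(standardise_column(["250.00", "I10", "428.0"]))
```

Because each cell is translated independently of the others, a column could also be split into chunks and processed by separate workers, which is one plausible reading of the parallelisation the abstract mentions; that step is not shown here.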