
Translating and evaluating historic phenotyping algorithms using SNOMED CT

OBJECTIVE: Patient phenotype definitions based on terminologies are required for the computational use of electronic health records. Within UK primary care research databases, such definitions have typically been represented as flat lists of Read terms, but Systematized Nomenclature of Medicine—Clin...


Bibliographic Details
Main authors: Elkheder, Musaab; Gonzalez-Izquierdo, Arturo; Qummer Ul Arfeen, Muhammad; Kuan, Valerie; Lumbers, R Thomas; Denaxas, Spiros; Shah, Anoop D
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9846670/
https://www.ncbi.nlm.nih.gov/pubmed/36083213
http://dx.doi.org/10.1093/jamia/ocac158
author Elkheder, Musaab
Gonzalez-Izquierdo, Arturo
Qummer Ul Arfeen, Muhammad
Kuan, Valerie
Lumbers, R Thomas
Denaxas, Spiros
Shah, Anoop D
collection PubMed
description OBJECTIVE: Patient phenotype definitions based on terminologies are required for the computational use of electronic health records. Within UK primary care research databases, such definitions have typically been represented as flat lists of Read terms, but Systematized Nomenclature of Medicine—Clinical Terms (SNOMED CT) (a widely employed international reference terminology) enables the use of relationships between concepts, which could facilitate the phenotyping process. We implemented SNOMED CT-based phenotyping approaches and investigated their performance in the CPRD Aurum primary care database. MATERIALS AND METHODS: We developed SNOMED CT phenotype definitions for 3 exemplar diseases: diabetes mellitus, asthma, and heart failure, using 3 methods: “primary” (primary concept and its descendants), “extended” (primary concept, descendants, and additional relations), and “value set” (based on text searches of term descriptions). We also derived SNOMED CT codelists in a semiautomated manner for 276 disease phenotypes used in a study of health across the lifecourse. Cohorts selected using each codelist were compared to “gold standard” manually curated Read codelists in a sample of 500 000 patients from CPRD Aurum. RESULTS: SNOMED CT codelists selected a similar set of patients to Read, with F1 scores exceeding 0.93, and age and sex distributions were similar. The “value set” and “extended” codelists had slightly greater recall but lower precision than “primary” codelists. We were able to represent 257 of the 276 phenotypes by a single concept hierarchy, and for 135 phenotypes, the F1 score was greater than 0.9. CONCLUSIONS: SNOMED CT provides an efficient way to define disease phenotypes, resulting in similar patient populations to manually curated codelists.
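The "primary" method and the cohort comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the concept identifiers, the toy hierarchy, the patient records, and the function names below are all hypothetical, and the "gold standard" cohort stands in for one selected by a manually curated Read codelist. The sketch expands a primary concept to include all of its descendants, selects patients who have any code in the resulting codelist, and scores that cohort against the gold standard using precision, recall, and F1.

```python
from collections import defaultdict


def descendants(is_a_edges, root):
    """Return the root concept plus every concept below it in the 'is a' hierarchy."""
    children = defaultdict(set)
    for child, parent in is_a_edges:
        children[parent].add(child)
    seen, stack = {root}, [root]
    while stack:
        for child in children[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def select_cohort(records, codelist):
    """Return the set of patients with at least one recorded code in the codelist."""
    return {patient for patient, code in records if code in codelist}


def precision_recall_f1(selected, gold):
    """Compare a selected cohort with a gold-standard cohort."""
    true_positives = len(selected & gold)
    precision = true_positives / len(selected) if selected else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical (child, parent) 'is a' pairs; these are NOT real SNOMED CT identifiers.
IS_A = [("C2", "C1"), ("C3", "C1"), ("C4", "C2"), ("C9", "C8")]

# Hypothetical (patient, code) records and a hypothetical gold-standard cohort.
RECORDS = [("p1", "C1"), ("p2", "C4"), ("p3", "C9"), ("p4", "C3"), ("p5", "C7")]
GOLD = {"p1", "p2", "p4", "p5"}

primary_codelist = descendants(IS_A, "C1")
cohort = select_cohort(RECORDS, primary_codelist)
precision, recall, f1 = precision_recall_f1(cohort, GOLD)
print(sorted(primary_codelist), sorted(cohort), precision, recall, round(f1, 3))
```

In this toy data the hierarchy-based codelist misses patient p5, whose code sits outside the primary concept's descendants. That is the kind of gap the "extended" and "value set" methods trade precision to close.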
format Online
Article
Text
id pubmed-9846670
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9846670 2023-01-20 J Am Med Inform Assoc Research and Applications Oxford University Press 2022-09-09 /pmc/articles/PMC9846670/ /pubmed/36083213 http://dx.doi.org/10.1093/jamia/ocac158 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of the American Medical Informatics Association. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Translating and evaluating historic phenotyping algorithms using SNOMED CT
topic Research and Applications