Translating and evaluating historic phenotyping algorithms using SNOMED CT
OBJECTIVE: Patient phenotype definitions based on terminologies are required for the computational use of electronic health records. Within UK primary care research databases, such definitions have typically been represented as flat lists of Read terms, but Systematized Nomenclature of Medicine—Clin...
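Main authors: | Elkheder, Musaab; Gonzalez-Izquierdo, Arturo; Qummer Ul Arfeen, Muhammad; Kuan, Valerie; Lumbers, R Thomas; Denaxas, Spiros; Shah, Anoop D |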
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2022 |
Subjects: | Research and Applications |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9846670/ https://www.ncbi.nlm.nih.gov/pubmed/36083213 http://dx.doi.org/10.1093/jamia/ocac158 |
_version_ | 1784871244316803072 |
---|---|
author | Elkheder, Musaab Gonzalez-Izquierdo, Arturo Qummer Ul Arfeen, Muhammad Kuan, Valerie Lumbers, R Thomas Denaxas, Spiros Shah, Anoop D |
author_facet | Elkheder, Musaab Gonzalez-Izquierdo, Arturo Qummer Ul Arfeen, Muhammad Kuan, Valerie Lumbers, R Thomas Denaxas, Spiros Shah, Anoop D |
author_sort | Elkheder, Musaab |
collection | PubMed |
description | OBJECTIVE: Patient phenotype definitions based on terminologies are required for the computational use of electronic health records. Within UK primary care research databases, such definitions have typically been represented as flat lists of Read terms, but Systematized Nomenclature of Medicine—Clinical Terms (SNOMED CT) (a widely employed international reference terminology) enables the use of relationships between concepts, which could facilitate the phenotyping process. We implemented SNOMED CT-based phenotyping approaches and investigated their performance in the CPRD Aurum primary care database. MATERIALS AND METHODS: We developed SNOMED CT phenotype definitions for 3 exemplar diseases: diabetes mellitus, asthma, and heart failure, using 3 methods: “primary” (primary concept and its descendants), “extended” (primary concept, descendants, and additional relations), and “value set” (based on text searches of term descriptions). We also derived SNOMED CT codelists in a semiautomated manner for 276 disease phenotypes used in a study of health across the lifecourse. Cohorts selected using each codelist were compared to “gold standard” manually curated Read codelists in a sample of 500 000 patients from CPRD Aurum. RESULTS: SNOMED CT codelists selected a similar set of patients to Read, with F1 scores exceeding 0.93, and age and sex distributions were similar. The “value set” and “extended” codelists had slightly greater recall but lower precision than “primary” codelists. We were able to represent 257 of the 276 phenotypes by a single concept hierarchy, and for 135 phenotypes, the F1 score was greater than 0.9. CONCLUSIONS: SNOMED CT provides an efficient way to define disease phenotypes, resulting in similar patient populations to manually curated codelists. |
format | Online Article Text |
id | pubmed-9846670 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-9846670 2023-01-20 Translating and evaluating historic phenotyping algorithms using SNOMED CT Elkheder, Musaab Gonzalez-Izquierdo, Arturo Qummer Ul Arfeen, Muhammad Kuan, Valerie Lumbers, R Thomas Denaxas, Spiros Shah, Anoop D J Am Med Inform Assoc Research and Applications OBJECTIVE: Patient phenotype definitions based on terminologies are required for the computational use of electronic health records. Within UK primary care research databases, such definitions have typically been represented as flat lists of Read terms, but Systematized Nomenclature of Medicine—Clinical Terms (SNOMED CT) (a widely employed international reference terminology) enables the use of relationships between concepts, which could facilitate the phenotyping process. We implemented SNOMED CT-based phenotyping approaches and investigated their performance in the CPRD Aurum primary care database. MATERIALS AND METHODS: We developed SNOMED CT phenotype definitions for 3 exemplar diseases: diabetes mellitus, asthma, and heart failure, using 3 methods: “primary” (primary concept and its descendants), “extended” (primary concept, descendants, and additional relations), and “value set” (based on text searches of term descriptions). We also derived SNOMED CT codelists in a semiautomated manner for 276 disease phenotypes used in a study of health across the lifecourse. Cohorts selected using each codelist were compared to “gold standard” manually curated Read codelists in a sample of 500 000 patients from CPRD Aurum. RESULTS: SNOMED CT codelists selected a similar set of patients to Read, with F1 scores exceeding 0.93, and age and sex distributions were similar. The “value set” and “extended” codelists had slightly greater recall but lower precision than “primary” codelists. We were able to represent 257 of the 276 phenotypes by a single concept hierarchy, and for 135 phenotypes, the F1 score was greater than 0.9. CONCLUSIONS: SNOMED CT provides an efficient way to define disease phenotypes, resulting in similar patient populations to manually curated codelists. Oxford University Press 2022-09-09 /pmc/articles/PMC9846670/ /pubmed/36083213 http://dx.doi.org/10.1093/jamia/ocac158 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of the American Medical Informatics Association. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research and Applications Elkheder, Musaab Gonzalez-Izquierdo, Arturo Qummer Ul Arfeen, Muhammad Kuan, Valerie Lumbers, R Thomas Denaxas, Spiros Shah, Anoop D Translating and evaluating historic phenotyping algorithms using SNOMED CT |
title | Translating and evaluating historic phenotyping algorithms using SNOMED CT |
title_full | Translating and evaluating historic phenotyping algorithms using SNOMED CT |
title_fullStr | Translating and evaluating historic phenotyping algorithms using SNOMED CT |
title_full_unstemmed | Translating and evaluating historic phenotyping algorithms using SNOMED CT |
title_short | Translating and evaluating historic phenotyping algorithms using SNOMED CT |
title_sort | translating and evaluating historic phenotyping algorithms using snomed ct |
topic | Research and Applications |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9846670/ https://www.ncbi.nlm.nih.gov/pubmed/36083213 http://dx.doi.org/10.1093/jamia/ocac158 |
work_keys_str_mv | AT elkhedermusaab translatingandevaluatinghistoricphenotypingalgorithmsusingsnomedct AT gonzalezizquierdoarturo translatingandevaluatinghistoricphenotypingalgorithmsusingsnomedct AT qummerularfeenmuhammad translatingandevaluatinghistoricphenotypingalgorithmsusingsnomedct AT kuanvalerie translatingandevaluatinghistoricphenotypingalgorithmsusingsnomedct AT lumbersrthomas translatingandevaluatinghistoricphenotypingalgorithmsusingsnomedct AT denaxasspiros translatingandevaluatinghistoricphenotypingalgorithmsusingsnomedct AT shahanoopd translatingandevaluatinghistoricphenotypingalgorithmsusingsnomedct |
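
Illustrative sketch (not part of the catalogued record): the abstract describes a "primary" codelist as a primary concept plus its descendants, and compares the resulting cohort with a reference cohort using precision, recall, and F1. The Python below shows one way that idea could be expressed. It is a sketch under stated assumptions, not the authors' implementation: the concept identifiers (ROOT, D, D1, ...), the patient identifiers, and all function names are hypothetical, and the toy hierarchy stands in for real SNOMED CT is-a relationships.

```python
from collections import defaultdict

# Hypothetical toy hierarchy as (child, parent) is-a pairs.
# These are NOT real SNOMED CT concept identifiers.
IS_A = [("D", "ROOT"), ("X", "ROOT"), ("D1", "D"), ("D2", "D"), ("D1a", "D1")]

# Hypothetical coded records as (patient, concept) pairs.
RECORDS = [("p1", "D1"), ("p2", "D1a"), ("p3", "X"), ("p4", "D2"), ("p5", "D")]

# Hypothetical reference cohort, standing in for a manually curated codelist.
REFERENCE = {"p1", "p2", "p4", "p6"}


def descendants(primary, is_a_pairs):
    """Return the primary concept plus every concept below it via is-a links."""
    children = defaultdict(set)
    for child, parent in is_a_pairs:
        children[parent].add(child)
    found, stack = {primary}, [primary]
    while stack:
        for child in children[stack.pop()]:
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def select_cohort(records, codelist):
    """Return the patients with at least one record coded in the codelist."""
    return {patient for patient, code in records if code in codelist}


def precision_recall_f1(selected, reference):
    """Compare a selected cohort against a reference cohort."""
    true_positives = len(selected & reference)
    precision = true_positives / len(selected) if selected else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


codelist = descendants("D", IS_A)
cohort = select_cohort(RECORDS, codelist)
precision, recall, f1 = precision_recall_f1(cohort, REFERENCE)
print(sorted(codelist), sorted(cohort), precision, recall, f1)
```

In this toy data the codelist built from D covers four concepts and selects four patients, three of whom are in the reference cohort, so precision, recall, and F1 all equal 0.75. An "extended" or "value set" codelist, as described in the abstract, would add concepts to the codelist before cohort selection; how those concepts are chosen is not specified in this record.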