Term sets: A transparent and reproducible representation of clinical code sets

OBJECTIVE: Clinical code sets are vital to research using routinely-collected electronic healthcare data. Existing code set engineering methods pose significant limitations when considering reproducible research. To improve the transparency and reusability of research, these code sets must abide by...

Bibliographic Details
Main Authors:	Williams, Richard, Brown, Benjamin, Kontopantelis, Evan, van Staa, Tjeerd, Peek, Niels
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2019
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6375602/ https://www.ncbi.nlm.nih.gov/pubmed/30763407 http://dx.doi.org/10.1371/journal.pone.0212291

_version_	1783395391383797760
author	Williams, Richard Brown, Benjamin Kontopantelis, Evan van Staa, Tjeerd Peek, Niels
author_facet	Williams, Richard Brown, Benjamin Kontopantelis, Evan van Staa, Tjeerd Peek, Niels
author_sort	Williams, Richard
collection	PubMed
description	OBJECTIVE: Clinical code sets are vital to research using routinely-collected electronic healthcare data. Existing code set engineering methods pose significant limitations when considering reproducible research. To improve the transparency and reusability of research, these code sets must abide by FAIR principles; this is not currently happening. We propose ‘term sets’, an equivalent alternative to code sets that are findable, accessible, interoperable and reusable. MATERIALS AND METHODS: We describe a new code set representation, consisting of natural language inclusion and exclusion terms (term sets), and explain its relationship to code sets. We formally prove that any code set has a corresponding term set. We demonstrate utility by searching for recently published code sets, representing them as term sets, and reporting on the number of inclusion and exclusion terms compared with the size of the code set. RESULTS: Thirty-one code sets from 20 papers covering diverse disease domains were converted into term sets. The term sets were on average 74% the size of their equivalent original code set. Four term sets were larger due to deficiencies in the original code sets. DISCUSSION: Term sets can concisely represent any code set. This may reduce barriers for examining and reusing code sets, which may accelerate research using healthcare databases. We have developed open-source software that supports researchers using term sets. CONCLUSION: Term sets are independent of clinical code terminologies and therefore: enable reproducible research; are resistant to terminology changes; and are less error-prone as they are shorter than the equivalent code set.
format	Online Article Text
id	pubmed-6375602
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-6375602 2019-03-01 Term sets: A transparent and reproducible representation of clinical code sets Williams, Richard Brown, Benjamin Kontopantelis, Evan van Staa, Tjeerd Peek, Niels PLoS One Research Article Public Library of Science 2019-02-14 /pmc/articles/PMC6375602/ /pubmed/30763407 http://dx.doi.org/10.1371/journal.pone.0212291 Text en © 2019 Williams et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle	Research Article Williams, Richard Brown, Benjamin Kontopantelis, Evan van Staa, Tjeerd Peek, Niels Term sets: A transparent and reproducible representation of clinical code sets
title	Term sets: A transparent and reproducible representation of clinical code sets
title_sort	term sets: a transparent and reproducible representation of clinical code sets
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6375602/ https://www.ncbi.nlm.nih.gov/pubmed/30763407 http://dx.doi.org/10.1371/journal.pone.0212291
work_keys_str_mv	AT williamsrichard termsetsatransparentandreproduciblerepresentationofclinicalcodesets AT brownbenjamin termsetsatransparentandreproduciblerepresentationofclinicalcodesets AT kontopantelisevan termsetsatransparentandreproduciblerepresentationofclinicalcodesets AT vanstaatjeerd termsetsatransparentandreproduciblerepresentationofclinicalcodesets AT peekniels termsetsatransparentandreproduciblerepresentationofclinicalcodesets
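
The abstract describes a term set as natural-language inclusion and exclusion terms that correspond to a code set. The following is a minimal sketch of that relationship, not the authors' software: it assumes a code belongs to the resolved set when its description contains at least one inclusion term and no exclusion term, using case-insensitive substring matching. The matching rule, the `TermSet` and `resolve` names, and the toy terminology with its codes are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TermSet:
    """Hypothetical container: inclusion and exclusion terms in natural language."""
    include: tuple[str, ...]
    exclude: tuple[str, ...] = ()


def contains_any(description: str, terms: tuple[str, ...]) -> bool:
    """True if the description contains any of the terms (case-insensitive)."""
    text = description.lower()
    return any(term.lower() in text for term in terms)


def resolve(term_set: TermSet, terminology: dict[str, str]) -> set[str]:
    """Assumed rule: keep codes matching an inclusion term and no exclusion term."""
    return {
        code
        for code, description in terminology.items()
        if contains_any(description, term_set.include)
        and not contains_any(description, term_set.exclude)
    }


# Made-up codes and descriptions; not taken from any real clinical terminology.
toy_terminology = {
    "X001": "Type 1 diabetes mellitus",
    "X002": "Type 2 diabetes mellitus",
    "X003": "Diabetes insipidus",
    "X004": "Family history of diabetes mellitus",
    "X005": "Asthma",
}

diabetes = TermSet(include=("diabetes mellitus",), exclude=("family history",))
diabetes_codes = resolve(diabetes, toy_terminology)
print(sorted(diabetes_codes))
```

Under these assumptions, two terms stand in for the resolved codes, and the same term set could be resolved again against a different or updated terminology without editing a code list by hand.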
