Measuring follow-up time in routinely-collected health datasets: Challenges and solutions
A key requirement for longitudinal studies using routinely-collected health data is to be able to measure what individuals are present in the datasets used, and over what time period. Individuals can enter and leave the covered population of administrative datasets for a variety of reasons, includin...
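Main Authors: | Thayer, Daniel; Rees, Arfon; Kennedy, Jon; Collins, Huw; Harris, Dan; Halcox, Julian; Ruschetti, Luca; Noyce, Richard; Brooks, Caroline |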
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7012444/ https://www.ncbi.nlm.nih.gov/pubmed/32045428 http://dx.doi.org/10.1371/journal.pone.0228545 |
_version_ | 1783496235291770880 |
---|---|
author | Thayer, Daniel Rees, Arfon Kennedy, Jon Collins, Huw Harris, Dan Halcox, Julian Ruschetti, Luca Noyce, Richard Brooks, Caroline |
author_facet | Thayer, Daniel Rees, Arfon Kennedy, Jon Collins, Huw Harris, Dan Halcox, Julian Ruschetti, Luca Noyce, Richard Brooks, Caroline |
author_sort | Thayer, Daniel |
collection | PubMed |
description | A key requirement for longitudinal studies using routinely-collected health data is to be able to measure what individuals are present in the datasets used, and over what time period. Individuals can enter and leave the covered population of administrative datasets for a variety of reasons, including both life events and characteristics of the datasets themselves. An automated, customizable method of determining individuals’ presence was developed for the primary care dataset in Swansea University’s SAIL Databank. The primary care dataset covers only a portion of Wales, with 76% of practices participating. The start and end date of the data varies by practice. Additionally, individuals can change practices or leave Wales. To address these issues, a two step process was developed. First, the period for which each practice had data available was calculated by measuring changes in the rate of events recorded over time. Second, the registration records for each individual were simplified. Anomalies such as short gaps and overlaps were resolved by applying a set of rules. The result of these two analyses was a cleaned set of records indicating start and end dates of available primary care data for each individual. Analysis of GP records showed that 91.0% of events occurred within periods calculated as having available data by the algorithm. 98.4% of those events were observed at the same practice of registration as that computed by the algorithm. A standardized method for solving this common problem has enabled faster development of studies using this data set. Using a rigorous, tested, standardized method of verifying presence in the study population will also positively influence the quality of research. |
format | Online Article Text |
id | pubmed-7012444 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-7012444 2020-02-21 Measuring follow-up time in routinely-collected health datasets: Challenges and solutions Thayer, Daniel Rees, Arfon Kennedy, Jon Collins, Huw Harris, Dan Halcox, Julian Ruschetti, Luca Noyce, Richard Brooks, Caroline PLoS One Research Article A key requirement for longitudinal studies using routinely-collected health data is to be able to measure what individuals are present in the datasets used, and over what time period. Individuals can enter and leave the covered population of administrative datasets for a variety of reasons, including both life events and characteristics of the datasets themselves. An automated, customizable method of determining individuals’ presence was developed for the primary care dataset in Swansea University’s SAIL Databank. The primary care dataset covers only a portion of Wales, with 76% of practices participating. The start and end date of the data varies by practice. Additionally, individuals can change practices or leave Wales. To address these issues, a two step process was developed. First, the period for which each practice had data available was calculated by measuring changes in the rate of events recorded over time. Second, the registration records for each individual were simplified. Anomalies such as short gaps and overlaps were resolved by applying a set of rules. The result of these two analyses was a cleaned set of records indicating start and end dates of available primary care data for each individual. Analysis of GP records showed that 91.0% of events occurred within periods calculated as having available data by the algorithm. 98.4% of those events were observed at the same practice of registration as that computed by the algorithm. A standardized method for solving this common problem has enabled faster development of studies using this data set. Using a rigorous, tested, standardized method of verifying presence in the study population will also positively influence the quality of research. Public Library of Science 2020-02-11 /pmc/articles/PMC7012444/ /pubmed/32045428 http://dx.doi.org/10.1371/journal.pone.0228545 Text en © 2020 Thayer et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Thayer, Daniel Rees, Arfon Kennedy, Jon Collins, Huw Harris, Dan Halcox, Julian Ruschetti, Luca Noyce, Richard Brooks, Caroline Measuring follow-up time in routinely-collected health datasets: Challenges and solutions |
title | Measuring follow-up time in routinely-collected health datasets: Challenges and solutions |
title_full | Measuring follow-up time in routinely-collected health datasets: Challenges and solutions |
title_fullStr | Measuring follow-up time in routinely-collected health datasets: Challenges and solutions |
title_full_unstemmed | Measuring follow-up time in routinely-collected health datasets: Challenges and solutions |
title_short | Measuring follow-up time in routinely-collected health datasets: Challenges and solutions |
title_sort | measuring follow-up time in routinely-collected health datasets: challenges and solutions |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7012444/ https://www.ncbi.nlm.nih.gov/pubmed/32045428 http://dx.doi.org/10.1371/journal.pone.0228545 |
work_keys_str_mv | AT thayerdaniel measuringfollowuptimeinroutinelycollectedhealthdatasetschallengesandsolutions AT reesarfon measuringfollowuptimeinroutinelycollectedhealthdatasetschallengesandsolutions AT kennedyjon measuringfollowuptimeinroutinelycollectedhealthdatasetschallengesandsolutions AT collinshuw measuringfollowuptimeinroutinelycollectedhealthdatasetschallengesandsolutions AT harrisdan measuringfollowuptimeinroutinelycollectedhealthdatasetschallengesandsolutions AT halcoxjulian measuringfollowuptimeinroutinelycollectedhealthdatasetschallengesandsolutions AT ruschettiluca measuringfollowuptimeinroutinelycollectedhealthdatasetschallengesandsolutions AT noycerichard measuringfollowuptimeinroutinelycollectedhealthdatasetschallengesandsolutions AT brookscaroline measuringfollowuptimeinroutinelycollectedhealthdatasetschallengesandsolutions |
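
The description field above outlines a two-step process: estimate the period for which each practice has data, then simplify each individual's registration records by resolving short gaps and overlaps. The record does not state the actual rules, so the following is only a minimal Python sketch under stated assumptions, not the authors' algorithm. The `Spell` type, the `simplify` and `clip_to_coverage` functions, the 30-day gap threshold, the "newer registration wins" overlap rule, and all dates are hypothetical. The practice coverage windows are taken as given rather than derived from event rates.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Spell:
    """One registration period at a practice (hypothetical structure)."""
    practice: str
    start: date
    end: date  # inclusive


def simplify(spells, max_gap_days=30):
    """Merge same-practice spells split by short gaps; trim overlaps.

    The 30-day threshold is an assumed value, not one from the paper.
    """
    cleaned = []
    for s in sorted(spells, key=lambda x: (x.start, x.end)):
        if cleaned:
            prev = cleaned[-1]
            # Whole unregistered days between the spells (negative if they overlap).
            gap = (s.start - prev.end).days - 1
            if s.practice == prev.practice and gap <= max_gap_days:
                # Short gap or overlap at the same practice: fuse into one spell.
                cleaned[-1] = Spell(prev.practice, prev.start, max(prev.end, s.end))
                continue
            if s.start <= prev.end:
                # Overlap across practices: assume the newer registration wins,
                # so the earlier spell ends the day before the newer one starts.
                new_end = s.start - timedelta(days=1)
                if new_end < prev.start:
                    cleaned.pop()
                else:
                    cleaned[-1] = Spell(prev.practice, prev.start, new_end)
        cleaned.append(s)
    return cleaned


def clip_to_coverage(spells, coverage):
    """Keep only the part of each spell inside its practice's data window."""
    clipped = []
    for s in spells:
        window = coverage.get(s.practice)
        if window is None:
            continue  # practice does not contribute data
        start, end = max(s.start, window[0]), min(s.end, window[1])
        if start <= end:
            clipped.append(Spell(s.practice, start, end))
    return clipped


# Invented example: one individual, two practices.
records = [
    Spell("A", date(2005, 1, 1), date(2008, 6, 30)),
    Spell("A", date(2008, 7, 10), date(2012, 3, 31)),   # 9-day gap, same practice
    Spell("B", date(2012, 3, 15), date(2015, 12, 31)),  # overlaps the spell at A
]
coverage = {
    "A": (date(2006, 1, 1), date(2019, 12, 31)),
    "B": (date(2000, 1, 1), date(2014, 12, 31)),
}
follow_up = clip_to_coverage(simplify(records), coverage)
```

Under these assumptions, `follow_up` holds one spell at practice A from 2006-01-01 to 2012-03-14 and one at practice B from 2012-03-15 to 2014-12-31. Keeping the two steps as separate functions mirrors the description, where practice-level availability and individual-level registration are computed independently and then combined.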