Optimising the use of electronic health records to estimate the incidence of rheumatoid arthritis in primary care: what information is hidden in free text?

BACKGROUND: Primary care databases are a major source of data for epidemiological and health services research. However, most studies are based on coded information, ignoring information stored in free text. Using the early presentation of rheumatoid arthritis (RA) as an exemplar, our objective was...

Bibliographic Details
Main authors: Ford, Elizabeth; Nicholson, Amanda; Koeling, Rob; Tate, A Rosemary; Carroll, John; Axelrod, Lesley; Smith, Helen E; Rait, Greta; Davies, Kevin A; Petersen, Irene; Williams, Tim; Cassell, Jackie A
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects: Research Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3765394/
https://www.ncbi.nlm.nih.gov/pubmed/23964710
http://dx.doi.org/10.1186/1471-2288-13-105
author Ford, Elizabeth
Nicholson, Amanda
Koeling, Rob
Tate, A Rosemary
Carroll, John
Axelrod, Lesley
Smith, Helen E
Rait, Greta
Davies, Kevin A
Petersen, Irene
Williams, Tim
Cassell, Jackie A
collection PubMed
description BACKGROUND: Primary care databases are a major source of data for epidemiological and health services research. However, most studies are based on coded information, ignoring information stored in free text. Using the early presentation of rheumatoid arthritis (RA) as an exemplar, our objective was to estimate the extent of data hidden within free text, using a keyword search. METHODS: We examined the electronic health records (EHRs) of 6,387 patients from the UK, aged 30 years and older, with a first coded diagnosis of RA between 2005 and 2008. We listed indicators for RA which were present in coded format and ran keyword searches for similar information held in free text. The frequency of indicator code groups and keywords from one year before to 14 days after RA diagnosis were compared, and temporal relationships examined. RESULTS: One or more keyword for RA was found in the free text in 29% of patients prior to the RA diagnostic code. Keywords for inflammatory arthritis diagnoses were present for 14% of patients whereas only 11% had a diagnostic code. Codes for synovitis were found in 3% of patients, but keywords were identified in an additional 17%. In 13% of patients there was evidence of a positive rheumatoid factor test in text only, uncoded. No gender differences were found. Keywords generally occurred close in time to the coded diagnosis of rheumatoid arthritis. They were often found under codes indicating letters and communications. CONCLUSIONS: Potential cases may be missed or wrongly dated when coded data alone are used to identify patients with RA, as diagnostic suspicions are frequently confined to text. The use of EHRs to create disease registers or assess quality of care will be misleading if free text information is not taken into account. Methods to facilitate the automated processing of text need to be developed and implemented.
format Online
Article
Text
id pubmed-3765394
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3765394 2013-09-07 BMC Med Res Methodol Research Article BioMed Central 2013-08-21 /pmc/articles/PMC3765394/ /pubmed/23964710 http://dx.doi.org/10.1186/1471-2288-13-105 Text en Copyright © 2013 Ford et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Optimising the use of electronic health records to estimate the incidence of rheumatoid arthritis in primary care: what information is hidden in free text?
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3765394/
https://www.ncbi.nlm.nih.gov/pubmed/23964710
http://dx.doi.org/10.1186/1471-2288-13-105