What are the implications of using individual and combined sources of routinely collected data to identify and characterise incident site-specific cancers? a concordance and validation study using linked English electronic health records data

OBJECTIVES: To describe the benefits and limitations of using individual and combinations of linked English electronic health data to identify incident cancers. DESIGN AND SETTING: Our descriptive study uses linked English Clinical Practice Research Datalink primary care; cancer registration; hospit...

Bibliographic Details
Main Authors: Strongman, Helen; Williams, Rachael; Bhaskaran, Krishnan
Format: Online Article Text
Language: English
Published: BMJ Publishing Group 2020
Subjects: Oncology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7443310/
https://www.ncbi.nlm.nih.gov/pubmed/32819994
http://dx.doi.org/10.1136/bmjopen-2020-037719
_version_ 1783573609936060416
author Strongman, Helen
Williams, Rachael
Bhaskaran, Krishnan
author_sort Strongman, Helen
collection PubMed
description OBJECTIVES: To describe the benefits and limitations of using individual and combinations of linked English electronic health data to identify incident cancers. DESIGN AND SETTING: Our descriptive study uses linked English Clinical Practice Research Datalink primary care; cancer registration; hospitalisation and death registration data. PARTICIPANTS AND MEASURES: We implemented case definitions to identify first site-specific cancers at the 20 most common sites, based on the first ever cancer diagnosis recorded in each individual or commonly used combination of data sources between 2000 and 2014. We calculated positive predictive values and sensitivities of each definition, compared with a gold standard algorithm that used information from all linked data sets to identify first cancers. We described completeness of grade and stage information in the cancer registration data set. RESULTS: 165 953 gold standard cancers were identified. Positive predictive values of all case definitions were ≥80% and ≥94% for the four most common cancers (breast, lung, colorectal and prostate). Sensitivity for case definitions that used cancer registration alone or in combination was ≥92% for the four most common cancers and ≥80% across all cancer sites except bladder cancer (65% using cancer registration alone). For case definitions using linked primary care, hospitalisation and death registration data, sensitivity was ≥89% for the four most common cancers, and ≥80% for all cancer sites except kidney (69%), oral cavity (76%) and ovarian cancer (78%). When primary care or hospitalisation data were used alone, sensitivities were generally lower and diagnosis dates were delayed. Completeness of staging data in cancer registration data was high from 2012 (minimum 76.0% in 2012 and 86.4% in 2014 for the four most common cancers). CONCLUSIONS: Ascertainment of incident cancers was good when using cancer registration data alone or in combination with other data sets, and for the majority of cancers when using a combination of primary care, hospitalisation and death registration data.
format Online
Article
Text
id pubmed-7443310
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BMJ Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-7443310 2020-08-28 What are the implications of using individual and combined sources of routinely collected data to identify and characterise incident site-specific cancers? a concordance and validation study using linked English electronic health records data Strongman, Helen Williams, Rachael Bhaskaran, Krishnan BMJ Open Oncology OBJECTIVES: To describe the benefits and limitations of using individual and combinations of linked English electronic health data to identify incident cancers. DESIGN AND SETTING: Our descriptive study uses linked English Clinical Practice Research Datalink primary care; cancer registration; hospitalisation and death registration data. PARTICIPANTS AND MEASURES: We implemented case definitions to identify first site-specific cancers at the 20 most common sites, based on the first ever cancer diagnosis recorded in each individual or commonly used combination of data sources between 2000 and 2014. We calculated positive predictive values and sensitivities of each definition, compared with a gold standard algorithm that used information from all linked data sets to identify first cancers. We described completeness of grade and stage information in the cancer registration data set. RESULTS: 165 953 gold standard cancers were identified. Positive predictive values of all case definitions were ≥80% and ≥94% for the four most common cancers (breast, lung, colorectal and prostate). Sensitivity for case definitions that used cancer registration alone or in combination was ≥92% for the four most common cancers and ≥80% across all cancer sites except bladder cancer (65% using cancer registration alone). For case definitions using linked primary care, hospitalisation and death registration data, sensitivity was ≥89% for the four most common cancers, and ≥80% for all cancer sites except kidney (69%), oral cavity (76%) and ovarian cancer (78%). When primary care or hospitalisation data were used alone, sensitivities were generally lower and diagnosis dates were delayed. Completeness of staging data in cancer registration data was high from 2012 (minimum 76.0% in 2012 and 86.4% in 2014 for the four most common cancers). CONCLUSIONS: Ascertainment of incident cancers was good when using cancer registration data alone or in combination with other data sets, and for the majority of cancers when using a combination of primary care, hospitalisation and death registration data. BMJ Publishing Group 2020-08-20 /pmc/articles/PMC7443310/ /pubmed/32819994 http://dx.doi.org/10.1136/bmjopen-2020-037719 Text en © Author(s) (or their employer(s)) 2020. Re-use permitted under CC BY. Published by BMJ. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits others to copy, redistribute, remix, transform and build upon this work for any purpose, provided the original work is properly cited, a link to the licence is given, and indication of whether changes were made. See: https://creativecommons.org/licenses/by/4.0/.
spellingShingle Oncology
Strongman, Helen
Williams, Rachael
Bhaskaran, Krishnan
What are the implications of using individual and combined sources of routinely collected data to identify and characterise incident site-specific cancers? a concordance and validation study using linked English electronic health records data
title What are the implications of using individual and combined sources of routinely collected data to identify and characterise incident site-specific cancers? a concordance and validation study using linked English electronic health records data
title_sort what are the implications of using individual and combined sources of routinely collected data to identify and characterise incident site-specific cancers? a concordance and validation study using linked english electronic health records data
topic Oncology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7443310/
https://www.ncbi.nlm.nih.gov/pubmed/32819994
http://dx.doi.org/10.1136/bmjopen-2020-037719
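
The description field of this record reports positive predictive values and sensitivities for each case definition against a gold standard. As a rough illustration of how those two measures relate, the following minimal Python sketch compares two sets of (patient, cancer site) pairs. The function name, identifiers and toy data are hypothetical and are not taken from the study; the authors' actual gold standard algorithm is not reproduced here.

```python
def ppv_and_sensitivity(definition_cases, gold_standard_cases):
    """Return (PPV, sensitivity) for one case definition.

    Both arguments are iterables of hashable case keys, here assumed to be
    (patient_id, cancer_site) pairs. This is an illustrative simplification,
    not the matching logic used in the study.
    """
    definition_cases = set(definition_cases)
    gold_standard_cases = set(gold_standard_cases)

    # Cases found by the definition that the gold standard also contains.
    true_positives = definition_cases & gold_standard_cases

    # PPV: share of the definition's cases that are confirmed.
    ppv = (len(true_positives) / len(definition_cases)
           if definition_cases else float("nan"))
    # Sensitivity: share of gold standard cases the definition finds.
    sensitivity = (len(true_positives) / len(gold_standard_cases)
                   if gold_standard_cases else float("nan"))
    return ppv, sensitivity


# Toy data (hypothetical): five gold standard cancers, four found by one definition.
gold = {("p1", "breast"), ("p2", "lung"), ("p3", "bladder"),
        ("p4", "kidney"), ("p5", "prostate")}
registry_only = {("p1", "breast"), ("p2", "lung"), ("p5", "prostate"),
                 ("p9", "lung")}

ppv, sens = ppv_and_sensitivity(registry_only, gold)
print(f"PPV = {ppv:.0%}, sensitivity = {sens:.0%}")  # PPV = 75%, sensitivity = 60%
```

In this toy example three of the four cases found by the definition are in the gold standard (PPV 75%), and three of the five gold standard cases are found (sensitivity 60%).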