Optimizing the Retrieval of the Vital Status of Cancer Patients for Health Data Warehouses by Using Open Government Data in France

Electronic Medical Records (EMR) and Electronic Health Records (EHR) are often missing critical information about the death of a patient, although it is an essential metric for medical research in oncology to assess survival outcomes, particularly for evaluating the efficacy of new therapeutic appro...

Bibliographic Details
Main Authors: Lauzanne, Olivier; Frenel, Jean-Sébastien; Baziz, Mustapha; Campone, Mario; Raimbourg, Judith; Bocquet, François
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8998644/
https://www.ncbi.nlm.nih.gov/pubmed/35409956
http://dx.doi.org/10.3390/ijerph19074272
_version_ 1784684992592347136
author Lauzanne, Olivier
Frenel, Jean-Sébastien
Baziz, Mustapha
Campone, Mario
Raimbourg, Judith
Bocquet, François
author_facet Lauzanne, Olivier
Frenel, Jean-Sébastien
Baziz, Mustapha
Campone, Mario
Raimbourg, Judith
Bocquet, François
author_sort Lauzanne, Olivier
collection PubMed
description Electronic Medical Records (EMR) and Electronic Health Records (EHR) are often missing critical information about the death of a patient, although it is an essential metric for medical research in oncology to assess survival outcomes, particularly for evaluating the efficacy of new therapeutic approaches. We used open government data in France from 1970 to September 2021 to identify deceased patients and match them with patient data collected from the Institut de Cancérologie de l’Ouest (ICO) data warehouse (Integrated Center of Oncology, the third largest cancer center in France) between January 2015 and November 2021. To meet our objective, we evaluated algorithms to perform a deterministic record linkage: an exact matching algorithm and a fuzzy matching algorithm. Because we lacked reference data, we needed to assess the algorithms by estimating the number of homonyms that could lead to false links, using the same open dataset of deceased persons in France. The exact matching algorithm allowed us to double the number of dates of death in the ICO data warehouse, and the fuzzy matching algorithm tripled it. Studying homonyms assured us that there was a low risk of misidentification, with precision values of 99.96% for the exact matching and 99.68% for the fuzzy matching. However, estimating the number of false negatives proved more difficult than anticipated. Nevertheless, using open government data can be a highly interesting way to improve the completeness of the date of death variable for oncology patients in data warehouses.
format Online
Article
Text
id pubmed-8998644
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-89986442022-04-12 Optimizing the Retrieval of the Vital Status of Cancer Patients for Health Data Warehouses by Using Open Government Data in France Lauzanne, Olivier Frenel, Jean-Sébastien Baziz, Mustapha Campone, Mario Raimbourg, Judith Bocquet, François Int J Environ Res Public Health Review Electronic Medical Records (EMR) and Electronic Health Records (EHR) are often missing critical information about the death of a patient, although it is an essential metric for medical research in oncology to assess survival outcomes, particularly for evaluating the efficacy of new therapeutic approaches. We used open government data in France from 1970 to September 2021 to identify deceased patients and match them with patient data collected from the Institut de Cancérologie de l’Ouest (ICO) data warehouse (Integrated Center of Oncology—the third largest cancer center in France) between January 2015 and November 2021. To meet our objective, we evaluated algorithms to perform a deterministic record linkage: an exact matching algorithm and a fuzzy matching algorithm. Because we lacked reference data, we needed to assess the algorithms by estimating the number of homonyms that could lead to false links, using the same open dataset of deceased persons in France. The exact matching algorithm allowed us to double the number of dates of death in the ICO data warehouse, and the fuzzy matching algorithm tripled it. Studying homonyms assured us that there was a low risk of misidentification, with precision values of 99.96% for the exact matching and 99.68% for the fuzzy matching. However, estimating the number of false negatives proved more difficult than anticipated. Nevertheless, using open government data can be a highly interesting way to improve the completeness of the date of death variable for oncology patients in data warehouses MDPI 2022-04-02 /pmc/articles/PMC8998644/ /pubmed/35409956 http://dx.doi.org/10.3390/ijerph19074272 Text en © 2022 by the authors. 
https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Review
Lauzanne, Olivier
Frenel, Jean-Sébastien
Baziz, Mustapha
Campone, Mario
Raimbourg, Judith
Bocquet, François
Optimizing the Retrieval of the Vital Status of Cancer Patients for Health Data Warehouses by Using Open Government Data in France
title Optimizing the Retrieval of the Vital Status of Cancer Patients for Health Data Warehouses by Using Open Government Data in France
topic Review
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8998644/
https://www.ncbi.nlm.nih.gov/pubmed/35409956
http://dx.doi.org/10.3390/ijerph19074272
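
The description above contrasts an exact matching algorithm with a fuzzy matching algorithm for deterministic record linkage. The sketch below illustrates the general difference between the two. It is not the authors' implementation: the record does not say which fields were compared or which similarity measure was used, so the field names (`last_name`, `first_name`, `birth_date`), the `normalize` helper, the use of `difflib.SequenceMatcher`, and the 0.85 threshold are all assumptions made for illustration.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Uppercase, strip accents, and keep letters only (hypothetical normalization)."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return "".join(c for c in no_accents.upper() if c.isalpha())


def exact_match(patient: dict, death: dict) -> bool:
    """Link only when normalized names and birth date are all identical."""
    return (
        normalize(patient["last_name"]) == normalize(death["last_name"])
        and normalize(patient["first_name"]) == normalize(death["first_name"])
        and patient["birth_date"] == death["birth_date"]
    )


def fuzzy_match(patient: dict, death: dict, threshold: float = 0.85) -> bool:
    """Link when birth dates agree and both names are similar enough.

    The similarity measure and threshold are assumptions, not taken from the article.
    """
    if patient["birth_date"] != death["birth_date"]:
        return False
    last = SequenceMatcher(
        None, normalize(patient["last_name"]), normalize(death["last_name"])
    ).ratio()
    first = SequenceMatcher(
        None, normalize(patient["first_name"]), normalize(death["first_name"])
    ).ratio()
    return last >= threshold and first >= threshold


# Invented example records, not real data.
patient = {"last_name": "Lefèvre", "first_name": "Jean-Sébastien", "birth_date": "1950-03-12"}
death_clean = {"last_name": "LEFEVRE", "first_name": "JEAN SEBASTIEN", "birth_date": "1950-03-12"}
death_typo = {"last_name": "LEFEVRRE", "first_name": "JEAN SEBASTIEN", "birth_date": "1950-03-12"}

print(exact_match(patient, death_clean))  # accents, case, and separators differ only
print(exact_match(patient, death_typo))   # one extra letter in the last name
print(fuzzy_match(patient, death_typo))   # tolerant of the small spelling difference
```

A looser rule of this kind recovers more links, at the cost of more potential false links between homonyms, which is consistent with the slightly lower precision the description reports for fuzzy matching.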