Data integration of National Dose Registry and survey data using multivariate imputation by chained equations

INTRODUCTION: Data integration is the process of merging information from multiple datasets generated from different sources, which can yield more information than a single data source. All diagnostic medical radiation workers were enrolled in the National Dose Registry (NDR) from 1996 to 201...

Bibliographic Details
Main Authors: Kim, Ryu Kyung, Kim, Young Min, Lee, Won Jin, Im, Jongho, Lee, Juhee, Bang, Ye Jin, Cha, Eun Shil
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9200363/
https://www.ncbi.nlm.nih.gov/pubmed/35704606
http://dx.doi.org/10.1371/journal.pone.0261534
author Kim, Ryu Kyung
Kim, Young Min
Lee, Won Jin
Im, Jongho
Lee, Juhee
Bang, Ye Jin
Cha, Eun Shil
collection PubMed
description INTRODUCTION: Data integration is the process of merging information from multiple datasets generated from different sources, which can yield more information than a single data source. All diagnostic medical radiation workers were enrolled in the National Dose Registry (NDR) from 1996 to 2011 and linked with mortality and cancer registry data (https://kdca.go.kr/). A survey was conducted during 2012-2013 using a self-reported questionnaire on occupational radiation practices among diagnostic medical radiation workers. METHODS: Data integration of the NDR and the survey was performed using the multivariate imputation by chained equations (MICE) algorithm. RESULTS: The results were compared by sex and type of job because the characteristics of the target variables for imputation depend on these variables. For nurses, the observed and pooled means of the frequency of interventional therapy differed because the distribution of medical facility type differed between the observed and completed data. For the marital status of males and females, and the pregnancy status of females, the observed and pooled means differed because the distribution of the year of birth differed between the observed and completed data. For lifetime smoking status, the percentage with smoking experience was higher in the completed data than in the observed data, which could be due to reasons such as underreporting among females and, for males, a difference in the distribution of drinking frequency between the observed and completed data. CONCLUSION: Data integration allows us to obtain survey information for NDR units without additional surveys, saving the time and cost of conducting them.
format Online
Article
Text
id pubmed-9200363
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-9200363 2022-06-16 Data integration of National Dose Registry and survey data using multivariate imputation by chained equations Kim, Ryu Kyung Kim, Young Min Lee, Won Jin Im, Jongho Lee, Juhee Bang, Ye Jin Cha, Eun Shil PLoS One Research Article
Public Library of Science 2022-06-15 /pmc/articles/PMC9200363/ /pubmed/35704606 http://dx.doi.org/10.1371/journal.pone.0261534 Text en © 2022 Kim et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Data integration of National Dose Registry and survey data using multivariate imputation by chained equations
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9200363/
https://www.ncbi.nlm.nih.gov/pubmed/35704606
http://dx.doi.org/10.1371/journal.pone.0261534