Data integration of National Dose Registry and survey data using multivariate imputation by chained equations
INTRODUCTION: Data integration is the process of merging information from multiple datasets generated by different sources, which can yield more information than a single data source. All diagnostic medical radiation workers were enrolled in the National Dose Registry (NDR) from 1996 to 201...
Main Authors: | Kim, Ryu Kyung; Kim, Young Min; Lee, Won Jin; Im, Jongho; Lee, Juhee; Bang, Ye Jin; Cha, Eun Shil |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9200363/ https://www.ncbi.nlm.nih.gov/pubmed/35704606 http://dx.doi.org/10.1371/journal.pone.0261534 |
_version_ | 1784728045622394880 |
---|---|
author | Kim, Ryu Kyung Kim, Young Min Lee, Won Jin Im, Jongho Lee, Juhee Bang, Ye Jin Cha, Eun Shil |
author_sort | Kim, Ryu Kyung |
collection | PubMed |
description | INTRODUCTION: Data integration is the process of merging information from multiple datasets generated by different sources, which can yield more information than a single data source. All diagnostic medical radiation workers were enrolled in the National Dose Registry (NDR) from 1996 to 2011 and linked with mortality and cancer registry data (https://kdca.go.kr/). A survey on occupational radiation practices among diagnostic medical radiation workers was conducted during 2012-2013 using a self-reported questionnaire. METHODS: Data integration of the NDR and the survey was performed using the multivariate imputation by chained equations (MICE) algorithm. RESULTS: The results were compared by sex and job type because the characteristics of the target variables for imputation depend on these variables. For nurses, the observed and pooled means for the frequency of interventional therapy differed because the distribution of medical facility types differed between the observed and completed data. For the marital status of males and females, and the pregnancy status of females, the observed and pooled means differed because the distribution of year of birth differed between the observed and completed data. For lifetime smoking status, the percentage with smoking experience was higher in the completed data than in the observed data, which could be due to factors such as underreporting among females and, for males, a difference in the distribution of drinking frequency between the observed and completed data. CONCLUSION: Data integration allows survey information to be obtained for NDR units without additional surveys, saving the time and cost of conducting them. |
format | Online Article Text |
id | pubmed-9200363 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-9200363 2022-06-16 Data integration of National Dose Registry and survey data using multivariate imputation by chained equations Kim, Ryu Kyung Kim, Young Min Lee, Won Jin Im, Jongho Lee, Juhee Bang, Ye Jin Cha, Eun Shil PLoS One Research Article
Public Library of Science 2022-06-15 /pmc/articles/PMC9200363/ /pubmed/35704606 http://dx.doi.org/10.1371/journal.pone.0261534 Text en © 2022 Kim et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
title | Data integration of National Dose Registry and survey data using multivariate imputation by chained equations |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9200363/ https://www.ncbi.nlm.nih.gov/pubmed/35704606 http://dx.doi.org/10.1371/journal.pone.0261534 |