Design and Development of a Big Data Platform for Disease Burden Based on the Spark Engine

OBJECTIVE: This study attempts to build a big data platform for disease burden that can realize the deep coupling of artificial intelligence and public health. This is a highly open and shared intelligent platform, including big data collection, analysis, and result visualization. METHODS: Based on...

Bibliographic Details
Main authors: Li, Chengcheng; Gao, Jing; Pan, Qingwei; Zhou, Zhihua; Yang, Yue; Zhou, Shangcheng
Format: Online Article Text
Language: English
Published: Hindawi 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9925246/
https://www.ncbi.nlm.nih.gov/pubmed/36793705
http://dx.doi.org/10.1155/2023/8963053
_version_ 1784888023538728960
author Li, Chengcheng
Gao, Jing
Pan, Qingwei
Zhou, Zhihua
Yang, Yue
Zhou, Shangcheng
collection PubMed
description OBJECTIVE: This study aims to build a big data platform for disease burden that deeply couples artificial intelligence with public health. The platform is highly open, shared, and intelligent, and covers big data collection, analysis, and result visualization. METHODS: Based on data mining theory and technology, the current state of multisource data on disease burden was analyzed. A disease burden big data management model, functional modules, and a technical framework were put forward, and Kafka was used to optimize the transmission efficiency of the underlying data. Embedding Spark MLlib in the Hadoop ecosystem is intended to make this an efficient and highly scalable data analysis platform. RESULTS: Following the concept of “Internet + medical integration,” an overall architecture for the disease burden management big data platform was proposed based on the Spark engine and the Python language. The main system components and application scenarios are presented at four levels, according to application scenarios and use requirements: multisource data collection, data processing, data analysis, and the application layer. CONCLUSION: The big data platform for disease burden management helps promote the multisource convergence of disease burden data and provides a new path toward a standardized paradigm of disease burden measurement. It also provides methods and ideas for the deep integration of medical big data and the formation of a broader standard paradigm.
format Online
Article
Text
id pubmed-9925246
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-9925246 2023-02-14 Design and Development of a Big Data Platform for Disease Burden Based on the Spark Engine Li, Chengcheng Gao, Jing Pan, Qingwei Zhou, Zhihua Yang, Yue Zhou, Shangcheng Comput Intell Neurosci Research Article OBJECTIVE: This study aims to build a big data platform for disease burden that deeply couples artificial intelligence with public health. The platform is highly open, shared, and intelligent, and covers big data collection, analysis, and result visualization. METHODS: Based on data mining theory and technology, the current state of multisource data on disease burden was analyzed. A disease burden big data management model, functional modules, and a technical framework were put forward, and Kafka was used to optimize the transmission efficiency of the underlying data. Embedding Spark MLlib in the Hadoop ecosystem is intended to make this an efficient and highly scalable data analysis platform. RESULTS: Following the concept of “Internet + medical integration,” an overall architecture for the disease burden management big data platform was proposed based on the Spark engine and the Python language. The main system components and application scenarios are presented at four levels, according to application scenarios and use requirements: multisource data collection, data processing, data analysis, and the application layer. CONCLUSION: The big data platform for disease burden management helps promote the multisource convergence of disease burden data and provides a new path toward a standardized paradigm of disease burden measurement. It also provides methods and ideas for the deep integration of medical big data and the formation of a broader standard paradigm. Hindawi 2023-02-06 /pmc/articles/PMC9925246/ /pubmed/36793705 http://dx.doi.org/10.1155/2023/8963053 Text en Copyright © 2023 Chengcheng Li et al.
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Design and Development of a Big Data Platform for Disease Burden Based on the Spark Engine
topic Research Article