
Improvement of Kafka Streaming Using Partition and Multi-Threading in Big Data Environment

The volume of programmable logic controller (PLC) sensing data in the manufacturing environment has increased rapidly, so Big Data platforms need a large data store. In this paper, we propose a Hadoop ecosystem for the support of many features in the manufacturing industry....


Bibliographic Details
Main Authors:	Leang, Bunrong, Ean, Sokchomrern, Ryu, Ga-Ae, Yoo, Kwan-Hee
Format:	Online Article Text
Language:	English
Published:	MDPI 2019
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6338896/ https://www.ncbi.nlm.nih.gov/pubmed/30609759 http://dx.doi.org/10.3390/s19010134

author	Leang, Bunrong Ean, Sokchomrern Ryu, Ga-Ae Yoo, Kwan-Hee
author_sort	Leang, Bunrong
collection	PubMed
description	The volume of programmable logic controller (PLC) sensing data in the manufacturing environment has increased rapidly, so Big Data platforms need a large data store. In this paper, we propose a Hadoop ecosystem that supports many features needed in the manufacturing industry. In this ecosystem, Apache Hadoop and HBase serve as Big Data storage and handle large-scale data. Apache Kafka is used as the data streaming pipeline; it offers many configurations and properties, such as Kafka offsets and partitions (used for scaling the program), that help build a better-designed environment and a more reliable system. Apache Spark works closely with the Kafka consumers to process and analyze the data in real time. Data security is applied in the transmission phase between the Kafka producers and consumers using public-key cryptography: the public key is located in the Kafka producer, and the private key is stored in the Kafka consumer. The integration of these technologies will enhance the performance and accuracy of data storing, processing, and securing in the manufacturing environment.
format	Online Article Text
id	pubmed-6338896
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-6338896 2019-01-23 Sensors (Basel) Article MDPI 2019-01-02 /pmc/articles/PMC6338896/ /pubmed/30609759 http://dx.doi.org/10.3390/s19010134 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title	Improvement of Kafka Streaming Using Partition and Multi-Threading in Big Data Environment
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6338896/ https://www.ncbi.nlm.nih.gov/pubmed/30609759 http://dx.doi.org/10.3390/s19010134
work_keys_str_mv	AT leangbunrong improvementofkafkastreamingusingpartitionandmultithreadinginbigdataenvironment AT eansokchomrern improvementofkafkastreamingusingpartitionandmultithreadinginbigdataenvironment AT ryugaae improvementofkafkastreamingusingpartitionandmultithreadinginbigdataenvironment AT yookwanhee improvementofkafkastreamingusingpartitionandmultithreadinginbigdataenvironment
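
The abstract describes Kafka partitions as the mechanism for scaling, combined with multi-threaded consumption. The following is a minimal in-memory sketch of that general idea, not the authors' implementation and not the Kafka client API: records are routed to a partition by a stable hash of their key, and one worker thread drains each partition in order. The names `NUM_PARTITIONS`, `partition_for`, `consume`, and `run`, the partition count, and the `plc-N` keys are all hypothetical, and CRC32 stands in for the murmur2 hash that Kafka's default partitioner applies to keyed records.

```python
import queue
import threading
import zlib

NUM_PARTITIONS = 3  # hypothetical partition count, not taken from the paper


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a record key to a partition using a stable hash (CRC32 as a stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def consume(partition: queue.Queue, out: list) -> None:
    """Drain one partition in arrival order until the None sentinel arrives."""
    while True:
        record = partition.get()
        if record is None:
            break
        out.append(record)


def run(records):
    """Route (key, value) records to partitions, with one consumer thread each."""
    partitions = [queue.Queue() for _ in range(NUM_PARTITIONS)]
    results = [[] for _ in range(NUM_PARTITIONS)]
    threads = [
        threading.Thread(target=consume, args=(partitions[i], results[i]))
        for i in range(NUM_PARTITIONS)
    ]
    for t in threads:
        t.start()
    # Producer side: the key decides the partition, so records that share
    # a key always reach the same consumer thread.
    for key, value in records:
        partitions[partition_for(key)].put((key, value))
    # One sentinel per partition tells its consumer to stop.
    for p in partitions:
        p.put(None)
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    sample = [(f"plc-{i % 4}", i) for i in range(12)]
    for index, consumed in enumerate(run(sample)):
        print(index, consumed)
```

Because each partition has exactly one consumer thread, ordering is preserved within a partition while separate partitions are processed concurrently, which is the trade-off that partition-based scaling relies on.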


Similar Items