
Ensuring the Robustness and Reliability of Data-Driven Knowledge Discovery Models in Production and Manufacturing

The Cross-Industry Standard Process for Data Mining (CRISP-DM) is a widely accepted framework in production and manufacturing. This data-driven knowledge discovery framework provides an orderly partition of the often complex data mining processes to ensure a practical implementation of data analytic...


Bibliographic Details
Main authors: Tripathi, Shailesh; Muhr, David; Brunner, Manuel; Jodlbauer, Herbert; Dehmer, Matthias; Emmert-Streib, Frank
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8236533/
https://www.ncbi.nlm.nih.gov/pubmed/34195608
http://dx.doi.org/10.3389/frai.2021.576892
_version_ 1783714557508714496
author Tripathi, Shailesh
Muhr, David
Brunner, Manuel
Jodlbauer, Herbert
Dehmer, Matthias
Emmert-Streib, Frank
collection PubMed
description The Cross-Industry Standard Process for Data Mining (CRISP-DM) is a widely accepted framework in production and manufacturing. This data-driven knowledge discovery framework provides an orderly partition of the often complex data mining processes to ensure a practical implementation of data analytics and machine learning models. However, the practical application of robust industry-specific data-driven knowledge discovery models faces multiple data- and model development-related issues. These issues need to be carefully addressed by allowing for a flexible, customized and industry-specific knowledge discovery framework. For this reason, extensions of CRISP-DM are needed. In this paper, we provide a detailed review of CRISP-DM and summarize extensions of this model into a novel framework we call the Generalized Cross-Industry Standard Process for Data Science (GCRISP-DS). This framework is designed to allow dynamic interactions between different phases to adequately address data- and model-related issues and so achieve robustness. Furthermore, it also emphasizes the need for a detailed business understanding and its interdependencies with the developed models and data quality for fulfilling higher business objectives. Overall, such a customizable GCRISP-DS framework supports model improvement and reusability by minimizing robustness issues.
format Online
Article
Text
id pubmed-8236533
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8236533 2021-06-29 Ensuring the Robustness and Reliability of Data-Driven Knowledge Discovery Models in Production and Manufacturing Tripathi, Shailesh Muhr, David Brunner, Manuel Jodlbauer, Herbert Dehmer, Matthias Emmert-Streib, Frank Front Artif Intell Artificial Intelligence Frontiers Media S.A. 2021-06-14 /pmc/articles/PMC8236533/ /pubmed/34195608 http://dx.doi.org/10.3389/frai.2021.576892 Text en Copyright © 2021 Tripathi, Muhr, Brunner, Jodlbauer, Dehmer and Emmert-Streib.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Ensuring the Robustness and Reliability of Data-Driven Knowledge Discovery Models in Production and Manufacturing
topic Artificial Intelligence
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8236533/
https://www.ncbi.nlm.nih.gov/pubmed/34195608
http://dx.doi.org/10.3389/frai.2021.576892