Ensuring the Robustness and Reliability of Data-Driven Knowledge Discovery Models in Production and Manufacturing
The Cross-Industry Standard Process for Data Mining (CRISP-DM) is a widely accepted framework in production and manufacturing. This data-driven knowledge discovery framework provides an orderly partition of the often complex data mining processes to ensure a practical implementation of data analytic...
Main Authors: | Shailesh Tripathi, David Muhr, Manuel Brunner, Herbert Jodlbauer, Matthias Dehmer, Frank Emmert-Streib |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8236533/ https://www.ncbi.nlm.nih.gov/pubmed/34195608 http://dx.doi.org/10.3389/frai.2021.576892 |
_version_ | 1783714557508714496 |
---|---|
author | Tripathi, Shailesh Muhr, David Brunner, Manuel Jodlbauer, Herbert Dehmer, Matthias Emmert-Streib, Frank |
author_facet | Tripathi, Shailesh Muhr, David Brunner, Manuel Jodlbauer, Herbert Dehmer, Matthias Emmert-Streib, Frank |
author_sort | Tripathi, Shailesh |
collection | PubMed |
description | The Cross-Industry Standard Process for Data Mining (CRISP-DM) is a widely accepted framework in production and manufacturing. This data-driven knowledge discovery framework provides an orderly partition of the often complex data mining processes to ensure a practical implementation of data analytics and machine learning models. However, the practical application of robust industry-specific data-driven knowledge discovery models faces multiple data- and model development-related issues. These issues need to be carefully addressed by allowing a flexible, customized and industry-specific knowledge discovery framework. For this reason, extensions of CRISP-DM are needed. In this paper, we provide a detailed review of CRISP-DM and summarize extensions of this model into a novel framework we call Generalized Cross-Industry Standard Process for Data Science (GCRISP-DS). This framework is designed to allow dynamic interactions between different phases to adequately address data- and model-related issues for achieving robustness. Furthermore, it also emphasizes the need for a detailed business understanding and the interdependencies with the developed models and data quality for fulfilling higher business objectives. Overall, such a customizable GCRISP-DS framework provides an enhancement for model improvements and reusability by minimizing robustness issues. |
format | Online Article Text |
id | pubmed-8236533 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-82365332021-06-29 Ensuring the Robustness and Reliability of Data-Driven Knowledge Discovery Models in Production and Manufacturing Tripathi, Shailesh Muhr, David Brunner, Manuel Jodlbauer, Herbert Dehmer, Matthias Emmert-Streib, Frank Front Artif Intell Artificial Intelligence The Cross-Industry Standard Process for Data Mining (CRISP-DM) is a widely accepted framework in production and manufacturing. This data-driven knowledge discovery framework provides an orderly partition of the often complex data mining processes to ensure a practical implementation of data analytics and machine learning models. However, the practical application of robust industry-specific data-driven knowledge discovery models faces multiple data- and model development-related issues. These issues need to be carefully addressed by allowing a flexible, customized and industry-specific knowledge discovery framework. For this reason, extensions of CRISP-DM are needed. In this paper, we provide a detailed review of CRISP-DM and summarize extensions of this model into a novel framework we call Generalized Cross-Industry Standard Process for Data Science (GCRISP-DS). This framework is designed to allow dynamic interactions between different phases to adequately address data- and model-related issues for achieving robustness. Furthermore, it emphasizes also the need for a detailed business understanding and the interdependencies with the developed models and data quality for fulfilling higher business objectives. Overall, such a customizable GCRISP-DS framework provides an enhancement for model improvements and reusability by minimizing robustness-issues. Frontiers Media S.A. 2021-06-14 /pmc/articles/PMC8236533/ /pubmed/34195608 http://dx.doi.org/10.3389/frai.2021.576892 Text en Copyright © 2021 Tripathi, Muhr, Brunner, Jodlbauer, Dehmer and Emmert-Streib. 
https://creativecommons.org/licenses/by/4.0/This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Artificial Intelligence Tripathi, Shailesh Muhr, David Brunner, Manuel Jodlbauer, Herbert Dehmer, Matthias Emmert-Streib, Frank Ensuring the Robustness and Reliability of Data-Driven Knowledge Discovery Models in Production and Manufacturing |
title | Ensuring the Robustness and Reliability of Data-Driven Knowledge Discovery Models in Production and Manufacturing |
title_full | Ensuring the Robustness and Reliability of Data-Driven Knowledge Discovery Models in Production and Manufacturing |
title_fullStr | Ensuring the Robustness and Reliability of Data-Driven Knowledge Discovery Models in Production and Manufacturing |
title_full_unstemmed | Ensuring the Robustness and Reliability of Data-Driven Knowledge Discovery Models in Production and Manufacturing |
title_short | Ensuring the Robustness and Reliability of Data-Driven Knowledge Discovery Models in Production and Manufacturing |
title_sort | ensuring the robustness and reliability of data-driven knowledge discovery models in production and manufacturing |
topic | Artificial Intelligence |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8236533/ https://www.ncbi.nlm.nih.gov/pubmed/34195608 http://dx.doi.org/10.3389/frai.2021.576892 |