Sampling via the aggregation value for data-driven manufacturing

Data-driven modelling has shown promising potential in many industrial applications, while the expensive and time-consuming labelling of experimental and simulation data restricts its further development. Preparing a more informative but smaller dataset to reduce labelling efforts has been a vital r...

Bibliographic Details
Main Authors: Liu, Xu; Chen, Gengxiang; Li, Yingguang; Chen, Lu; Meng, Qinglu; Mehdi-Souzani, Charyar
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9646999/
https://www.ncbi.nlm.nih.gov/pubmed/36381214
http://dx.doi.org/10.1093/nsr/nwac201
author Liu, Xu
Chen, Gengxiang
Li, Yingguang
Chen, Lu
Meng, Qinglu
Mehdi-Souzani, Charyar
collection PubMed
description Data-driven modelling has shown promising potential in many industrial applications, while the expensive and time-consuming labelling of experimental and simulation data restricts its further development. Preparing a more informative but smaller dataset to reduce labelling efforts has been a vital research problem. Although existing techniques can assess the value of individual data samples, how to represent the value of a sample set remains an open problem. In this research, the aggregation value is defined using a novel representation for the value of a sample set by modelling the invisible redundant information as the overlaps of neighbouring values. The sampling problem is hence converted to the maximisation of the submodular function over the aggregation value. The comprehensive analysis of several manufacturing datasets demonstrates that the proposed method can provide sample sets with superior and stable performance compared with state-of-the-art methods. The research outcome also indicates its appealing potential to reduce labelling efforts for more data-scarcity scenarios.
format Online
Article
Text
id pubmed-9646999
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9646999 2022-11-14 Sampling via the aggregation value for data-driven manufacturing. Liu, Xu; Chen, Gengxiang; Li, Yingguang; Chen, Lu; Meng, Qinglu; Mehdi-Souzani, Charyar. Natl Sci Rev. Research Article. Oxford University Press 2022-09-24 /pmc/articles/PMC9646999/ /pubmed/36381214 http://dx.doi.org/10.1093/nsr/nwac201 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of China Science Publishing & Media Ltd. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Sampling via the aggregation value for data-driven manufacturing
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9646999/
https://www.ncbi.nlm.nih.gov/pubmed/36381214
http://dx.doi.org/10.1093/nsr/nwac201