Predicting Hard Disk Failure by Means of Automatized Labeling and Machine Learning Approach
Today, cloud systems provide many key services to development and production environments; reliable storage services are crucial for a multitude of applications ranging from commercial manufacturing, distribution and sales up to scientific research, which is often at the forefront of computing resou...
Main authors: | Gargiulo, Federico; Duellmann, Dirk; Arpaia, Pasquale; Lo Moriello, Rosario Schiano |
---|---|
Language: | eng |
Published: | 2021 |
Online access: | https://dx.doi.org/10.3390/app11188293 http://cds.cern.ch/record/2783635 |
_version_ | 1780972063389908992 |
---|---|
author | Gargiulo, Federico; Duellmann, Dirk; Arpaia, Pasquale; Lo Moriello, Rosario Schiano |
author_facet | Gargiulo, Federico; Duellmann, Dirk; Arpaia, Pasquale; Lo Moriello, Rosario Schiano |
author_sort | Gargiulo, Federico |
collection | CERN |
description | Today, cloud systems provide many key services to development and production environments; reliable storage services are crucial for a multitude of applications ranging from commercial manufacturing, distribution and sales to scientific research, which is often at the forefront of computing resource demands. In large-scale computer centers, the storage system requires particular attention and investment; usually, a large number of diverse storage devices need to be deployed in order to match the varying performance and volume requirements of changing user applications. As of today, magnetic drives still play a dominant role in terms of deployed storage volume and of service outages due to device failure. In this paper, we study methods to facilitate automated proactive disk replacement. We propose a method to identify disks with media failures in a production environment and describe an application of supervised machine learning to predict disk failures. In particular, a stage that automatically labels the disks (healthy/at-risk) during training and validation is presented, along with a tuning strategy to optimize the hyperparameters of the associated machine learning classifier. The approach is trained and validated against a large set of 65,000 hard drives in the CERN computer center, and the achieved results are discussed. |
id | cern-2783635 |
institution | European Organization for Nuclear Research |
language | eng |
publishDate | 2021 |
record_format | invenio |
spelling | cern-2783635 2021-10-09T20:37:20Z doi:10.3390/app11188293 http://cds.cern.ch/record/2783635 eng Gargiulo, Federico; Duellmann, Dirk; Arpaia, Pasquale; Lo Moriello, Rosario Schiano. Predicting Hard Disk Failure by Means of Automatized Labeling and Machine Learning Approach. Today, cloud systems provide many key services to development and production environments; reliable storage services are crucial for a multitude of applications ranging from commercial manufacturing, distribution and sales to scientific research, which is often at the forefront of computing resource demands. In large-scale computer centers, the storage system requires particular attention and investment; usually, a large number of diverse storage devices need to be deployed in order to match the varying performance and volume requirements of changing user applications. As of today, magnetic drives still play a dominant role in terms of deployed storage volume and of service outages due to device failure. In this paper, we study methods to facilitate automated proactive disk replacement. We propose a method to identify disks with media failures in a production environment and describe an application of supervised machine learning to predict disk failures. In particular, a stage that automatically labels the disks (healthy/at-risk) during training and validation is presented, along with a tuning strategy to optimize the hyperparameters of the associated machine learning classifier. The approach is trained and validated against a large set of 65,000 hard drives in the CERN computer center, and the achieved results are discussed. oai:cds.cern.ch:2783635 2021 |
spellingShingle | Gargiulo, Federico; Duellmann, Dirk; Arpaia, Pasquale; Lo Moriello, Rosario Schiano. Predicting Hard Disk Failure by Means of Automatized Labeling and Machine Learning Approach |
title | Predicting Hard Disk Failure by Means of Automatized Labeling and Machine Learning Approach |
title_sort | predicting hard disk failure by means of automatized labeling and machine learning approach |
url | https://dx.doi.org/10.3390/app11188293 http://cds.cern.ch/record/2783635 |
work_keys_str_mv | AT gargiulofederico predictingharddiskfailurebymeansofautomatizedlabelingandmachinelearningapproach AT duellmanndirk predictingharddiskfailurebymeansofautomatizedlabelingandmachinelearningapproach AT arpaiapasquale predictingharddiskfailurebymeansofautomatizedlabelingandmachinelearningapproach AT lomoriellorosarioschiano predictingharddiskfailurebymeansofautomatizedlabelingandmachinelearningapproach |