CricShotClassify: An Approach to Classifying Batting Shots from Cricket Videos Using a Convolutional Neural Network and Gated Recurrent Unit
Main authors: | |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI, 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8072636/ https://www.ncbi.nlm.nih.gov/pubmed/33919484 http://dx.doi.org/10.3390/s21082846 |
Summary: | Recognizing cricket batting shots can support context-based advertising for viewers, sensor-based commentary generation, and coaching assistants. Because different batting shots look similar, manual feature extraction from video frames is tedious. This paper proposes a hybrid deep-neural-network architecture for classifying 10 different cricket batting shots from offline videos. We composed a novel dataset, CricShot10, comprising batting-shot clips of uneven length recorded under unpredictable illumination conditions. Motivated by the success of deep-learning models, we used a convolutional neural network (CNN) for automatic feature extraction and a gated recurrent unit (GRU) to handle long temporal dependencies. Initially, conventional CNN and dilated CNN-based architectures were developed. Following that, four transfer-learning models (VGG16, InceptionV3, Xception, and DenseNet169) were investigated with all layers frozen. Experimental results showed that the VGG16–GRU model outperformed the other models, attaining 86% accuracy. We further explored VGG16 by developing two models, one freezing all but the final 4 VGG16 layers and another freezing all but the final 8 VGG16 layers. On our CricShot10 dataset, both models reached 93% accuracy. These results verify the effectiveness of our proposed architecture compared with other methods in terms of accuracy. |
---|
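
The summary describes two fine-tuned variants that freeze all but the final 4 or final 8 VGG16 layers. The sketch below illustrates one way such a freezing plan could be expressed. It is not the authors' code: the helper `freeze_plan` is hypothetical, the layer names follow the naming convention Keras uses for the VGG16 convolutional base (with `input` as a placeholder name), and counting pooling layers as "layers" is an assumption, since the record does not say how the paper counts them.

```python
# Layer names of the VGG16 convolutional base, listed by hand so the sketch
# needs no deep-learning framework. "input" is a placeholder name; whether
# the paper counts pooling layers toward "the final 4/8 layers" is assumed.
VGG16_BASE_LAYERS = [
    "input",
    "block1_conv1", "block1_conv2", "block1_pool",
    "block2_conv1", "block2_conv2", "block2_pool",
    "block3_conv1", "block3_conv2", "block3_conv3", "block3_pool",
    "block4_conv1", "block4_conv2", "block4_conv3", "block4_pool",
    "block5_conv1", "block5_conv2", "block5_conv3", "block5_pool",
]


def freeze_plan(layer_names, n_trainable):
    """Map each layer name to True (trainable) or False (frozen).

    Only the final `n_trainable` layers stay trainable; 0 freezes everything,
    which corresponds to the fully frozen transfer-learning baseline.
    """
    if not 0 <= n_trainable <= len(layer_names):
        raise ValueError("n_trainable must be between 0 and the layer count")
    cutoff = len(layer_names) - n_trainable
    return {name: index >= cutoff for index, name in enumerate(layer_names)}


if __name__ == "__main__":
    for n in (0, 4, 8):
        plan = freeze_plan(VGG16_BASE_LAYERS, n)
        trainable = [name for name, flag in plan.items() if flag]
        print(f"final {n} trainable:", trainable)
```

With a real Keras model, a plan like this would typically be applied by setting each layer's `trainable` attribute before compiling; the per-frame features from the partly unfrozen base would then feed the GRU that models the frame sequence.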