A Multimodal Convolutional Neural Network Model for the Analysis of Music Genre on Children's Emotions Influence Intelligence

Bibliographic Details
Main authors: Chen, Wei; Wu, Guobin
Format: Online Article Text
Language: English
Published: Hindawi 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9444378/
https://www.ncbi.nlm.nih.gov/pubmed/36072733
http://dx.doi.org/10.1155/2022/5611456
Description
Summary: This paper designs a multimodal convolutional neural network model for the intelligent analysis of the influence of music genres on children's emotions. Considering the diversity of music genre features in the audio power spectrogram, Mel filtering is used in the feature extraction stage: reducing the dimensionality of the Mel-filtered signal retains the genre attributes of the audio signal while sharpening the differences between the features extracted for different genres. In the model input stage, the audio power spectrogram obtained by feature extraction is cut into segments to reduce the input size and expand the scale of model training. The MSCN-LSTM consists of two modules: a multiscale-kernel convolutional neural network (MSCNN) and a long short-term memory (LSTM) network. The MSCNN extracts features from the EEG signal, the LSTM extracts the temporal characteristics of the eye-movement signal, and the two are combined by feature-level fusion. The multimodal signal yields higher emotion classification accuracy than the unimodal signal: the average accuracy of four-class emotion classification based on a 6-channel EEG signal and the children's multimodal signal reaches 97.94%. After pretraining on the Million Song Dataset (MSD), the model's performance improved significantly further. The accuracy of the Dense Inception network rose to 91.0% on the GTZAN dataset and 89.91% on the ISMIR2004 dataset, demonstrating the effectiveness and advanced performance of the Dense Inception network.
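
The feature extraction and input stages described in the summary (Mel filtering of the power spectrogram, then cutting it into pieces) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The sample rate, FFT size, hop length, number of Mel bands, segment length, and the log compression step are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    # Common HTK-style Hz -> Mel conversion (an assumed choice of Mel scale).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular Mel filters, shape (n_mels, n_fft // 2 + 1)."""
    n_bins = n_fft // 2 + 1
    fft_freqs = np.linspace(0.0, sr / 2.0, n_bins)  # rfft bin centre frequencies
    # n_mels + 2 edge frequencies, equally spaced on the Mel scale.
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, n_bins))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def power_spectrogram(signal, n_fft, hop):
    """Hann-windowed short-time power spectrum, shape (n_frames, n_bins)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def mel_segments(signal, sr=22050, n_fft=1024, hop=512, n_mels=64, seg_frames=128):
    """Mel-filter the power spectrogram, then cut it into fixed-length segments."""
    power = power_spectrogram(signal, n_fft, hop)
    # Mel filtering reduces n_fft // 2 + 1 frequency bins to n_mels bands.
    mel = power @ mel_filterbank(n_mels, n_fft, sr).T
    log_mel = np.log(mel + 1e-10)  # log compression: an assumption, not in the summary
    n_seg = log_mel.shape[0] // seg_frames  # trailing frames that do not fill a segment are dropped
    return log_mel[: n_seg * seg_frames].reshape(n_seg, seg_frames, n_mels)
```

With these assumed settings, a 10-second clip sampled at 22050 Hz yields three segments of 128 frames by 64 Mel bands, so each clip contributes several smaller training inputs instead of one large one.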