Combined spectral and speech features for pig speech recognition

The sound of the pig is one of its important signs, which can reflect various states such as hunger, pain or emotional state, and directly indicates the growth and health status of the pig. Existing speech recognition methods usually start with spectral features. The use of spectrograms to achieve c...

Bibliographic Details
Main authors: Wu, Xuan; Zhou, Silong; Chen, Mingwei; Zhao, Yihang; Wang, Yifei; Zhao, Xianmeng; Li, Danyang; Pu, Haibo
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9714723/
https://www.ncbi.nlm.nih.gov/pubmed/36454724
http://dx.doi.org/10.1371/journal.pone.0276778
_version_ 1784842290570723328
author Wu, Xuan
Zhou, Silong
Chen, Mingwei
Zhao, Yihang
Wang, Yifei
Zhao, Xianmeng
Li, Danyang
Pu, Haibo
collection PubMed
description Pig vocalizations are an important sign: they can reflect states such as hunger, pain, or emotional state, and they directly indicate the growth and health status of the pig. Existing speech recognition methods usually start from spectral features. Using spectrograms to classify different sounds works well, but it may not be the best approach for such tasks when only single-dimensional feature input is used. On this basis, and in order to assess the condition of pigs more accurately and take timely measures to protect their health, this paper proposes a pig sound classification method based on the dual role of signal spectrum and speech. Spectrograms visualize the characteristics of a sound over different time periods. The spectrogram features and the time-domain features of the input audio complement each other and are passed into a pre-designed parallel network structure. The best-performing network model and classifier were selected and combined. An accuracy of 93.39% was achieved on the pig speech classification task, with an AUC of 0.99163, demonstrating the superiority of the method. This study contributes to computer vision and acoustics by recognizing pig sounds. In addition, a dataset of 4,000 pig sounds in four categories is established in this paper to provide a basis for later research.
format Online
Article
Text
id pubmed-9714723
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-9714723 2022-12-02 Combined spectral and speech features for pig speech recognition Wu, Xuan; Zhou, Silong; Chen, Mingwei; Zhao, Yihang; Wang, Yifei; Zhao, Xianmeng; Li, Danyang; Pu, Haibo PLoS One Research Article
Public Library of Science 2022-12-01 /pmc/articles/PMC9714723/ /pubmed/36454724 http://dx.doi.org/10.1371/journal.pone.0276778 Text en © 2022 Wu et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Combined spectral and speech features for pig speech recognition
topic Research Article