
Extended performance analysis of deep-learning algorithms for mice vocalization segmentation


Bibliographic Details

Main Authors: Baggi, Daniele, Premoli, Marika, Gnutti, Alessandro, Bonini, Sara Anna, Leonardi, Riccardo, Memo, Maurizio, Migliorati, Pierangelo
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10336146/
https://www.ncbi.nlm.nih.gov/pubmed/37433808
http://dx.doi.org/10.1038/s41598-023-38186-7
collection PubMed
description Analysis of ultrasonic vocalizations (USVs) is a fundamental tool for studying animal communication. It can be used for behavioral investigation of mice in ethological studies and in the fields of neuroscience and neuropharmacology. USVs are usually recorded with a microphone sensitive to ultrasound frequencies and then processed by dedicated software, which helps the operator identify and characterize different families of calls. Recently, many automated systems have been proposed to perform both the detection and the classification of USVs. Segmentation is the crucial step of this general framework, since the quality of the subsequent call processing depends strictly on how accurately each call has been detected. In this paper, we investigate the performance of three supervised deep-learning methods for automated USV segmentation: an Auto-Encoder Neural Network (AE), a U-NET Neural Network (UNET) and a Recurrent Neural Network (RNN). The proposed models receive as input the spectrogram of the recorded audio track and return as output the regions in which USV calls have been detected. To evaluate the models, we built a dataset by recording several audio tracks and manually segmenting the corresponding USV spectrograms generated with the Avisoft software, thereby producing the ground truth (GT) used for training. All three architectures achieved precision and recall scores exceeding [Formula: see text], with UNET and AE achieving values above [Formula: see text], surpassing the other state-of-the-art methods considered for comparison in this study. Additionally, the evaluation was extended to an external dataset, where UNET again exhibited the highest performance. We suggest that our experimental results may represent a valuable benchmark for future work.
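The pipeline the abstract describes (audio track → spectrogram → per-frame detection mask → precision/recall against a manually segmented ground truth) can be sketched in a few lines of numpy/scipy. This is a hedged illustration only: the paper's detectors are neural networks (AE, UNET, RNN), whereas the `pred` mask below comes from a naive energy threshold used purely as a stand-in; the synthetic 70 kHz tone burst, the 250 kHz sample rate, and all thresholds are assumptions, not values taken from the paper.

```python
import numpy as np
from scipy.signal import spectrogram

def frame_precision_recall(pred_mask, gt_mask):
    """Frame-level precision/recall between two binary segmentation masks."""
    pred = np.asarray(pred_mask, dtype=bool)
    gt = np.asarray(gt_mask, dtype=bool)
    tp = np.sum(pred & gt)                      # frames flagged by both
    precision = tp / max(int(pred.sum()), 1)    # flagged frames that are correct
    recall = tp / max(int(gt.sum()), 1)         # true call frames that were found
    return precision, recall

np.random.seed(0)

# Synthetic recording: 1 s at 250 kHz, weak noise plus a 70 kHz "call"
# between 0.4 s and 0.6 s (ultrasonic rates like this are hypothetical here).
fs = 250_000
t = np.arange(fs) / fs
audio = 0.01 * np.random.randn(fs)
call = (t > 0.4) & (t < 0.6)
audio[call] += np.sin(2 * np.pi * 70_000 * t[call])

# Spectrogram: the representation the models in the paper take as input.
f, frames, Sxx = spectrogram(audio, fs=fs, nperseg=512, noverlap=256)

# Stand-in detector: threshold the summed energy in an ultrasonic band.
band = (f >= 40_000) & (f <= 120_000)
energy = Sxx[band].sum(axis=0)
pred = energy > 10 * np.median(energy)

# Ground truth at frame resolution, analogous to a manual segmentation.
gt = (frames > 0.4) & (frames < 0.6)

p, r = frame_precision_recall(pred, gt)
```

On this easy synthetic example both scores come out near 1; the point is only the shape of the evaluation, where any segmenter that emits a per-frame mask can be scored the same way against a manually produced GT mask.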
id pubmed-10336146
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling Sci Rep (Article). Nature Publishing Group UK, published online 2023-07-11. /pmc/articles/PMC10336146/ /pubmed/37433808 http://dx.doi.org/10.1038/s41598-023-38186-7. © The Author(s) 2023. Open Access: this article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, provided appropriate credit is given, a link to the licence is provided, and any changes are indicated.
topic Article