Extended performance analysis of deep-learning algorithms for mice vocalization segmentation
The analysis of ultrasonic vocalizations (USVs) is a fundamental tool for studying animal communication. It can be used for behavioral investigations of mice in ethological studies and in the fields of neuroscience and neuropharmacology. USVs are usually recorded with a microphone sensitive...
Main authors: | Baggi, Daniele; Premoli, Marika; Gnutti, Alessandro; Bonini, Sara Anna; Leonardi, Riccardo; Memo, Maurizio; Migliorati, Pierangelo |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2023 |
Subjects: | Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10336146/ https://www.ncbi.nlm.nih.gov/pubmed/37433808 http://dx.doi.org/10.1038/s41598-023-38186-7 |
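
The record's description (full text in the fields below) explains that the proposed models take the spectrogram of a recorded audio track as input and return the regions in which USV calls are detected. As a minimal illustration of that spectrogram-to-regions pipeline, the sketch below computes a log-power spectrogram and converts a per-frame detection mask into call intervals. The sampling rate, FFT parameters, 0.5 threshold, and helper names (`usv_spectrogram`, `mask_to_intervals`) are assumptions for illustration only, not the authors' implementation.

```python
# Illustrative sketch of the spectrogram -> call-region pipeline described in
# the record's abstract. The segmentation model itself is abstracted away as a
# per-frame score vector; all parameter values here are assumed, not the paper's.
import numpy as np
from scipy.signal import spectrogram


def usv_spectrogram(audio, fs=250_000, nperseg=512, noverlap=256):
    """Log-power spectrogram of an ultrasound recording (fs and window are assumed values)."""
    f, t, sxx = spectrogram(audio, fs=fs, nperseg=nperseg, noverlap=noverlap)
    return f, t, 10.0 * np.log10(sxx + 1e-12)


def mask_to_intervals(mask, frame_times):
    """Convert a per-frame binary detection mask into (start, end) call regions in seconds."""
    intervals, start = [], None
    for i, active in enumerate(mask):
        if active and start is None:
            start = frame_times[i]            # a call region begins
        elif not active and start is not None:
            intervals.append((start, frame_times[i]))  # the call region ends
            start = None
    if start is not None:                      # close a region still open at the end
        intervals.append((start, frame_times[-1]))
    return intervals


# Usage (hypothetical): scores = model(spec) giving one value per spectrogram frame,
# then calls = mask_to_intervals(scores > 0.5, t).
```

Working frame by frame keeps the detections directly comparable to a manually segmented ground truth, which is also how precision and recall are sketched after the record fields below.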
_version_ | 1785071147808718848 |
---|---|
author | Baggi, Daniele Premoli, Marika Gnutti, Alessandro Bonini, Sara Anna Leonardi, Riccardo Memo, Maurizio Migliorati, Pierangelo |
author_facet | Baggi, Daniele Premoli, Marika Gnutti, Alessandro Bonini, Sara Anna Leonardi, Riccardo Memo, Maurizio Migliorati, Pierangelo |
author_sort | Baggi, Daniele |
collection | PubMed |
description | The analysis of ultrasonic vocalizations (USVs) is a fundamental tool for studying animal communication. It can be used for behavioral investigations of mice in ethological studies and in the fields of neuroscience and neuropharmacology. USVs are usually recorded with a microphone sensitive to ultrasound frequencies and then processed by dedicated software, which helps the operator identify and characterize different families of calls. Recently, many automated systems have been proposed to perform both the detection and the classification of USVs. USV segmentation is the crucial step of this general framework, since the quality of the subsequent call processing depends directly on how accurately each call has been detected. In this paper, we investigate the performance of three supervised deep-learning methods for automated USV segmentation: an Auto-Encoder Neural Network (AE), a U-NET Neural Network (UNET) and a Recurrent Neural Network (RNN). The proposed models receive as input the spectrogram of the recorded audio track and return as output the regions in which USV calls have been detected. To evaluate the models, we built a dataset by recording several audio tracks and manually segmenting the corresponding USV spectrograms generated with the Avisoft software, thereby producing the ground truth (GT) used for training. All three proposed architectures achieved precision and recall scores exceeding [Formula: see text], with UNET and AE achieving values above [Formula: see text], surpassing the other state-of-the-art methods considered for comparison in this study. Additionally, the evaluation was extended to an external dataset, on which UNET again exhibited the highest performance. We suggest that our experimental results may represent a valuable benchmark for future work. |
format | Online Article Text |
id | pubmed-10336146 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-10336146 2023-07-13 Extended performance analysis of deep-learning algorithms for mice vocalization segmentation Baggi, Daniele Premoli, Marika Gnutti, Alessandro Bonini, Sara Anna Leonardi, Riccardo Memo, Maurizio Migliorati, Pierangelo Sci Rep Article The analysis of ultrasonic vocalizations (USVs) is a fundamental tool for studying animal communication. It can be used for behavioral investigations of mice in ethological studies and in the fields of neuroscience and neuropharmacology. USVs are usually recorded with a microphone sensitive to ultrasound frequencies and then processed by dedicated software, which helps the operator identify and characterize different families of calls. Recently, many automated systems have been proposed to perform both the detection and the classification of USVs. USV segmentation is the crucial step of this general framework, since the quality of the subsequent call processing depends directly on how accurately each call has been detected. In this paper, we investigate the performance of three supervised deep-learning methods for automated USV segmentation: an Auto-Encoder Neural Network (AE), a U-NET Neural Network (UNET) and a Recurrent Neural Network (RNN). The proposed models receive as input the spectrogram of the recorded audio track and return as output the regions in which USV calls have been detected. To evaluate the models, we built a dataset by recording several audio tracks and manually segmenting the corresponding USV spectrograms generated with the Avisoft software, thereby producing the ground truth (GT) used for training. All three proposed architectures achieved precision and recall scores exceeding [Formula: see text], with UNET and AE achieving values above [Formula: see text], surpassing the other state-of-the-art methods considered for comparison in this study. Additionally, the evaluation was extended to an external dataset, on which UNET again exhibited the highest performance. We suggest that our experimental results may represent a valuable benchmark for future work. Nature Publishing Group UK 2023-07-11 /pmc/articles/PMC10336146/ /pubmed/37433808 http://dx.doi.org/10.1038/s41598-023-38186-7 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Baggi, Daniele Premoli, Marika Gnutti, Alessandro Bonini, Sara Anna Leonardi, Riccardo Memo, Maurizio Migliorati, Pierangelo Extended performance analysis of deep-learning algorithms for mice vocalization segmentation |
title | Extended performance analysis of deep-learning algorithms for mice vocalization segmentation |
title_full | Extended performance analysis of deep-learning algorithms for mice vocalization segmentation |
title_fullStr | Extended performance analysis of deep-learning algorithms for mice vocalization segmentation |
title_full_unstemmed | Extended performance analysis of deep-learning algorithms for mice vocalization segmentation |
title_short | Extended performance analysis of deep-learning algorithms for mice vocalization segmentation |
title_sort | extended performance analysis of deep-learning algorithms for mice vocalization segmentation |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10336146/ https://www.ncbi.nlm.nih.gov/pubmed/37433808 http://dx.doi.org/10.1038/s41598-023-38186-7 |
work_keys_str_mv | AT baggidaniele extendedperformanceanalysisofdeeplearningalgorithmsformicevocalizationsegmentation AT premolimarika extendedperformanceanalysisofdeeplearningalgorithmsformicevocalizationsegmentation AT gnuttialessandro extendedperformanceanalysisofdeeplearningalgorithmsformicevocalizationsegmentation AT boninisaraanna extendedperformanceanalysisofdeeplearningalgorithmsformicevocalizationsegmentation AT leonardiriccardo extendedperformanceanalysisofdeeplearningalgorithmsformicevocalizationsegmentation AT memomaurizio extendedperformanceanalysisofdeeplearningalgorithmsformicevocalizationsegmentation AT miglioratipierangelo extendedperformanceanalysisofdeeplearningalgorithmsformicevocalizationsegmentation |
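
The abstract above reports precision and recall of the detected call regions against the manually segmented ground truth (GT). Below is a minimal frame-wise sketch of how such scores could be computed, assuming both the prediction and the GT are boolean masks on the same spectrogram time grid; the exact matching criterion used in the paper is not reproduced here, and the function name is hypothetical.

```python
# Frame-wise precision/recall sketch; assumes boolean per-frame masks, which is
# one possible way to score segmentations against a manually labeled GT.
import numpy as np


def frame_precision_recall(pred_mask, gt_mask):
    """Precision and recall of a predicted call mask against the ground-truth mask."""
    pred_mask = np.asarray(pred_mask, dtype=bool)
    gt_mask = np.asarray(gt_mask, dtype=bool)
    tp = np.sum(pred_mask & gt_mask)    # frames correctly marked as call
    fp = np.sum(pred_mask & ~gt_mask)   # frames falsely marked as call
    fn = np.sum(~pred_mask & gt_mask)   # call frames that were missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```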