
Auxiliary self-supervision to metric learning for music similarity-based retrieval and auto-tagging

Bibliographic Details
Main Authors: Akama, Taketo, Kitano, Hiroaki, Takematsu, Katsuhiro, Miyajima, Yasushi, Polouliakh, Natalia
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10688627/
https://www.ncbi.nlm.nih.gov/pubmed/38032868
http://dx.doi.org/10.1371/journal.pone.0294643
_version_ 1785152201174286336
author Akama, Taketo
Kitano, Hiroaki
Takematsu, Katsuhiro
Miyajima, Yasushi
Polouliakh, Natalia
author_facet Akama, Taketo
Kitano, Hiroaki
Takematsu, Katsuhiro
Miyajima, Yasushi
Polouliakh, Natalia
author_sort Akama, Taketo
collection PubMed
description In the realm of music information retrieval, similarity-based retrieval and auto-tagging serve as essential components. Similarity-based retrieval involves automatically analyzing a music track and fetching analogous tracks from a database. Auto-tagging, on the other hand, assesses a music track to deduce associated tags, such as genre and mood. Given the limitations and non-scalability of human supervision signals, it becomes crucial for models to learn from alternative sources to enhance their performance. Contrastive learning-based self-supervised learning, which exclusively relies on learning signals derived from music audio data, has demonstrated its efficacy in the context of auto-tagging. In this work, we propose a model that builds on the self-supervised learning approach to address the similarity-based retrieval challenge by introducing our method of metric learning with a self-supervised auxiliary loss. Furthermore, diverging from conventional self-supervised learning methodologies, we discovered the advantages of concurrently training the model with both self-supervision and supervision signals, without freezing pre-trained models. We also found that refraining from employing augmentation during the fine-tuning phase yields better results. Our experimental results confirm that the proposed methodology enhances retrieval and tagging performance metrics in two distinct scenarios: one where human-annotated tags are consistently available for all music tracks, and another where such tags are accessible only for a subset of music tracks.
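The description above amounts to a single joint objective: a supervised metric-learning loss plus a self-supervised contrastive auxiliary loss, both back-propagated through the same unfrozen encoder trained concurrently. The following is only a rough illustrative sketch of that idea in PyTorch, not the authors' implementation; the triplet loss, the NT-Xent formulation, the weighting factor lambda_ssl, and how the triplets and the two views per clip are produced (including where augmentation is or is not applied) are assumptions made for this sketch.

import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.1):
    # Self-supervised contrastive (NT-Xent) loss between two views of the same clips.
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                      # (2N, D) stacked embeddings
    sim = z @ z.t() / temperature                       # pairwise cosine similarities
    n = z1.size(0)
    self_mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float("-inf"))     # exclude self-similarity
    # For row i, the positive is the other view of the same clip.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)

def joint_loss(encoder, anc, pos, neg, view1, view2, lambda_ssl=1.0):
    # Supervised metric-learning term (triplet loss is one possible choice).
    metric_term = F.triplet_margin_loss(encoder(anc), encoder(pos), encoder(neg))
    # Self-supervised auxiliary term, computed with the same, unfrozen encoder.
    ssl_term = nt_xent_loss(encoder(view1), encoder(view2))
    # Single objective: both signals update the encoder concurrently.
    return metric_term + lambda_ssl * ssl_term

Because the two terms share one encoder and are optimized together, the self-supervision acts as an auxiliary regularizer rather than a frozen pre-training stage, which is the concurrent-training setup the abstract contrasts with conventional pre-train-then-freeze pipelines.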
format Online
Article
Text
id pubmed-10688627
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-106886272023-12-01 Auxiliary self-supervision to metric learning for music similarity-based retrieval and auto-tagging Akama, Taketo Kitano, Hiroaki Takematsu, Katsuhiro Miyajima, Yasushi Polouliakh, Natalia PLoS One Research Article In the realm of music information retrieval, similarity-based retrieval and auto-tagging serve as essential components. Similarity-based retrieval involves automatically analyzing a music track and fetching analogous tracks from a database. Auto-tagging, on the other hand, assesses a music track to deduce associated tags, such as genre and mood. Given the limitations and non-scalability of human supervision signals, it becomes crucial for models to learn from alternative sources to enhance their performance. Contrastive learning-based self-supervised learning, which exclusively relies on learning signals derived from music audio data, has demonstrated its efficacy in the context of auto-tagging. In this work, we propose a model that builds on the self-supervised learning approach to address the similarity-based retrieval challenge by introducing our method of metric learning with a self-supervised auxiliary loss. Furthermore, diverging from conventional self-supervised learning methodologies, we discovered the advantages of concurrently training the model with both self-supervision and supervision signals, without freezing pre-trained models. We also found that refraining from employing augmentation during the fine-tuning phase yields better results. Our experimental results confirm that the proposed methodology enhances retrieval and tagging performance metrics in two distinct scenarios: one where human-annotated tags are consistently available for all music tracks, and another where such tags are accessible only for a subset of music tracks. Public Library of Science 2023-11-30 /pmc/articles/PMC10688627/ /pubmed/38032868 http://dx.doi.org/10.1371/journal.pone.0294643 Text en © 2023 Akama et al https://creativecommons.org/licenses/by/4.0/This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Akama, Taketo
Kitano, Hiroaki
Takematsu, Katsuhiro
Miyajima, Yasushi
Polouliakh, Natalia
Auxiliary self-supervision to metric learning for music similarity-based retrieval and auto-tagging
title Auxiliary self-supervision to metric learning for music similarity-based retrieval and auto-tagging
title_full Auxiliary self-supervision to metric learning for music similarity-based retrieval and auto-tagging
title_fullStr Auxiliary self-supervision to metric learning for music similarity-based retrieval and auto-tagging
title_full_unstemmed Auxiliary self-supervision to metric learning for music similarity-based retrieval and auto-tagging
title_short Auxiliary self-supervision to metric learning for music similarity-based retrieval and auto-tagging
title_sort auxiliary self-supervision to metric learning for music similarity-based retrieval and auto-tagging
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10688627/
https://www.ncbi.nlm.nih.gov/pubmed/38032868
http://dx.doi.org/10.1371/journal.pone.0294643
work_keys_str_mv AT akamataketo auxiliaryselfsupervisiontometriclearningformusicsimilaritybasedretrievalandautotagging
AT kitanohiroaki auxiliaryselfsupervisiontometriclearningformusicsimilaritybasedretrievalandautotagging
AT takematsukatsuhiro auxiliaryselfsupervisiontometriclearningformusicsimilaritybasedretrievalandautotagging
AT miyajimayasushi auxiliaryselfsupervisiontometriclearningformusicsimilaritybasedretrievalandautotagging
AT polouliakhnatalia auxiliaryselfsupervisiontometriclearningformusicsimilaritybasedretrievalandautotagging