Auxiliary self-supervision to metric learning for music similarity-based retrieval and auto-tagging
In the realm of music information retrieval, similarity-based retrieval and auto-tagging serve as essential components. Similarity-based retrieval involves automatically analyzing a music track and fetching analogous tracks from a database. Auto-tagging, on the other hand, assesses a music track to...
Main authors: | Akama, Taketo; Kitano, Hiroaki; Takematsu, Katsuhiro; Miyajima, Yasushi; Polouliakh, Natalia |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10688627/ https://www.ncbi.nlm.nih.gov/pubmed/38032868 http://dx.doi.org/10.1371/journal.pone.0294643 |
_version_ | 1785152201174286336 |
---|---|
author | Akama, Taketo Kitano, Hiroaki Takematsu, Katsuhiro Miyajima, Yasushi Polouliakh, Natalia |
author_sort | Akama, Taketo |
collection | PubMed |
description | In the realm of music information retrieval, similarity-based retrieval and auto-tagging serve as essential components. Similarity-based retrieval involves automatically analyzing a music track and fetching analogous tracks from a database. Auto-tagging, on the other hand, assesses a music track to deduce associated tags, such as genre and mood. Given the limitations and non-scalability of human supervision signals, it becomes crucial for models to learn from alternative sources to enhance their performance. Contrastive learning-based self-supervised learning, which exclusively relies on learning signals derived from music audio data, has demonstrated its efficacy in the context of auto-tagging. In this work, we propose a model that builds on the self-supervised learning approach to address the similarity-based retrieval challenge by introducing our method of metric learning with a self-supervised auxiliary loss. Furthermore, diverging from conventional self-supervised learning methodologies, we discovered the advantages of concurrently training the model with both self-supervision and supervision signals, without freezing pre-trained models. We also found that refraining from employing augmentation during the fine-tuning phase yields better results. Our experimental results confirm that the proposed methodology enhances retrieval and tagging performance metrics in two distinct scenarios: one where human-annotated tags are consistently available for all music tracks, and another where such tags are accessible only for a subset of music tracks. |
format | Online Article Text |
id | pubmed-10688627 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-10688627 2023-12-01 Auxiliary self-supervision to metric learning for music similarity-based retrieval and auto-tagging Akama, Taketo Kitano, Hiroaki Takematsu, Katsuhiro Miyajima, Yasushi Polouliakh, Natalia PLoS One Research Article Public Library of Science 2023-11-30 /pmc/articles/PMC10688627/ /pubmed/38032868 http://dx.doi.org/10.1371/journal.pone.0294643 Text en © 2023 Akama et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
title | Auxiliary self-supervision to metric learning for music similarity-based retrieval and auto-tagging |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10688627/ https://www.ncbi.nlm.nih.gov/pubmed/38032868 http://dx.doi.org/10.1371/journal.pone.0294643 |
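
The description field in this record mentions "metric learning with a self-supervised auxiliary loss" trained concurrently with supervision signals, but the record gives no formulas. The following is only a minimal sketch of how such a combined objective could look, assuming a triplet loss for the supervised metric-learning term and an InfoNCE-style contrastive loss for the self-supervised term. Both loss choices, every function name, and the `margin`, `temperature`, and `aux_weight` parameters are illustrative assumptions, not the authors' published method.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def l2_normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(dot(v, v)) or 1.0
    return [x / norm for x in v]


def sq_dist(u, v):
    """Squared Euclidean distance between two embeddings."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Assumed supervised term: the tag-matched positive should be closer
    to the anchor than the negative by at least `margin`."""
    return max(0.0, sq_dist(anchor, positive) - sq_dist(anchor, negative) + margin)


def info_nce_loss(view_a, view_b, temperature=0.1):
    """Assumed self-supervised term: view_a[i] and view_b[i] embed two
    views of the same track; other tracks in the batch act as negatives."""
    a = [l2_normalize(v) for v in view_a]
    b = [l2_normalize(v) for v in view_b]
    total = 0.0
    for i, a_i in enumerate(a):
        logits = [dot(a_i, b_j) / temperature for b_j in b]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_denom = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denom - logits[i]  # cross-entropy with target index i
    return total / len(a)


def combined_loss(triplets, view_a, view_b, aux_weight=0.5):
    """Metric-learning loss plus a weighted self-supervised auxiliary loss."""
    metric = sum(triplet_loss(*t) for t in triplets) / len(triplets)
    return metric + aux_weight * info_nce_loss(view_a, view_b)


if __name__ == "__main__":
    triplets = [([0.0, 0.0], [0.1, 0.0], [1.0, 0.0])]
    views = [[1.0, 0.0], [0.0, 1.0]]
    print(combined_loss(triplets, views, views))
```

In this sketch both terms are summed into one scalar, so a single optimizer step would update the encoder with supervised and self-supervised signals at the same time, which is the joint-training idea the description refers to. How the views are produced and how the terms are weighted are left open here.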