[IPOL announce] new article: A Brief Analysis of SLAVC method for Sound Source Localization
announcements about the IPOL journal
announce at list.ipol.im
Wed May 29 19:28:36 CEST 2024
A new article is available in IPOL: https://www.ipol.im/pub/art/2024/525/
Xavier Juanola and Gloria Haro,
A Brief Analysis of SLAVC method for Sound Source Localization,
Image Processing On Line, 14 (2024), pp. 159–172.
https://doi.org/10.5201/ipol.2024.525
Abstract
In 2022, Mo and Morgado introduced a novel self-supervised learning
approach for Visual Sound Source Localization, denoted as SLAVC [Mo, S.
and Morgado, P., A Closer Look at Weakly-Supervised Audio-Visual Source
Localization, Advances in Neural Information Processing Systems, 2022].
The proposed method is based on multiple-instance contrastive learning.
In addition to improving on the results of previous methods, it also
solves two critical problems that former methods faced: 1) excessive
overfitting despite training on extensive datasets, and 2) a tendency to
hallucinate sound sources even when the video offers no visual evidence
to support them. In this paper, we briefly present the method, offer an
online executable version that lets users test it on their own
image-audio pairs, and propose some improvements that could benefit the
model as future work.