Wed AM1.L5.2
ATTENTION-BASED METHODS FOR AUDIO QUESTION ANSWERING
Parthasaarathy Ariyakulam Sudarsanam, Tuomas Virtanen, Tampere University, Finland
Session: Wed AM1.L5: Multimodal Learning for Audio and Language Lecture
Track: Special Sessions
Location: Press room
Presentation Time: Wed, 6 Sep, 10:50 - 11:10 Finland Time (UTC +3)
Session Chair: Xubo Liu, University of Surrey
Session Wed AM1.L5
Wed AM1.L5.1: KNOWLEDGE DISTILLATION FOR EFFICIENT AUDIO-VISUAL VIDEO CAPTIONING
Özkan Çaylı, Izmir Katip Çelebi University, Turkey; Xubo Liu, University of Surrey, United Kingdom; Volkan Kılıç, Izmir Katip Çelebi University, Turkey; Wenwu Wang, University of Surrey, United Kingdom
Wed AM1.L5.2: ATTENTION-BASED METHODS FOR AUDIO QUESTION ANSWERING
Parthasaarathy Ariyakulam Sudarsanam, Tuomas Virtanen, Tampere University, Finland
Wed AM1.L5.3: ENHANCING AUDIO RETRIEVAL WITH ATTENTION-BASED ENCODER FOR AUDIO FEATURE REPRESENTATION
Feiyang Xiao, Harbin Engineering University, China; Qiaoxi Zhu, University of Technology Sydney, Australia; Jian Guan, Harbin Engineering University, China; Wenwu Wang, University of Surrey, United Kingdom
Wed AM1.L5.4: Multitask learning in Audio Captioning: a sentence embedding regression loss acts as a regularizer
Etienne Labbé, Julien Pinquier, Thomas Pellegrini, IRIT, France
Wed AM1.L5.5: Leveraging Pre-trained AudioLDM for Sound Generation: A Benchmark Study
Yi Yuan, Haohe Liu, University of Surrey, United Kingdom; Jinhua Liang, Queen Mary University of London, United Kingdom; Xubo Liu, Mark D. Plumbley, Wenwu Wang, University of Surrey, United Kingdom
Wed AM1.L5.6: ACES: EVALUATING AUTOMATED AUDIO CAPTIONING MODELS ON THE SEMANTICS OF SOUNDS
Gijs Wijngaard, Elia Formisano, Maastricht University, Netherlands; Bruno Giordano, CNRS and Université Aix-Marseille, France; Michel Dumontier, Maastricht University, Netherlands