Wed AM1.L5: Multimodal Learning for Audio and Language
Wed, 6 Sep, 10:30 - 12:30 Finland Time (UTC +3)
Location: Press room
Session Type: Lecture
Session Chair: Xubo Liu, University of Surrey
Track: Special Sessions
Wed, 6 Sep, 10:30 - 10:50 Finland Time (UTC +3)
Wed AM1.L5.1: Knowledge Distillation for Efficient Audio-Visual Video Captioning
Wed, 6 Sep, 10:50 - 11:10 Finland Time (UTC +3)
Wed AM1.L5.2: Attention-Based Methods for Audio Question Answering
Wed, 6 Sep, 11:10 - 11:30 Finland Time (UTC +3)
Wed AM1.L5.3: Enhancing Audio Retrieval with Attention-Based Encoder for Audio Feature Representation
Wed, 6 Sep, 11:30 - 11:50 Finland Time (UTC +3)
Wed AM1.L5.4: Multitask learning in Audio Captioning: a sentence embedding regression loss acts as a regularizer
Wed, 6 Sep, 11:50 - 12:10 Finland Time (UTC +3)
Wed AM1.L5.5: Leveraging Pre-trained AudioLDM for Sound Generation: A Benchmark Study
Wed, 6 Sep, 12:10 - 12:30 Finland Time (UTC +3)