AASP-L9: Audio-Language Processing and Audio Captioning
Thu, 18 Apr, 13:10 - 15:10 (UTC +9)
Location: Room E1
Session Type: Lecture
Session Co-Chairs: Jonathan Le Roux (Mitsubishi Electric Research Laboratories) and Wenwu Wang (University of Surrey)
Track: Audio and Acoustic Signal Processing
Thu, 18 Apr, 13:30 - 13:50 (UTC +9)
AASP-L9.2: IMPROVING AUDIO CAPTIONING MODELS WITH FINE-GRAINED AUDIO FEATURES, TEXT EMBEDDING SUPERVISION, AND LLM MIX-UP AUGMENTATION
Thu, 18 Apr, 14:30 - 14:50 (UTC +9)
AASP-L9.5: LEARNING AUDIO CONCEPTS FROM COUNTERFACTUAL NATURAL LANGUAGE
Thu, 18 Apr, 14:50 - 15:10 (UTC +9)