AASP-L5.1
AUDIOGENIE-REASONER: A TRAINING-FREE MULTI-AGENT FRAMEWORK FOR COARSE-TO-FINE AUDIO DEEP REASONING
Yan Rong, The Hong Kong University of Science and Technology (Guangzhou), China; Chenxing Li, Dong Yu, Tencent AI Lab, China; Li Liu, The Hong Kong University of Science and Technology (Guangzhou), China
Session: AASP-L5: Audio Understanding and Generation Oral
Track: Audio and Acoustic Signal Processing [AA]
Location: Room 127+128
Presentation Time: Wed, 6 May, 16:30 - 16:50
Session AASP-L5
AASP-L5.1: AUDIOGENIE-REASONER: A TRAINING-FREE MULTI-AGENT FRAMEWORK FOR COARSE-TO-FINE AUDIO DEEP REASONING
Yan Rong, The Hong Kong University of Science and Technology (Guangzhou), China; Chenxing Li, Dong Yu, Tencent AI Lab, China; Li Liu, The Hong Kong University of Science and Technology (Guangzhou), China
AASP-L5.2: LAMB: LLM-BASED AUDIO CAPTIONING WITH MODALITY GAP BRIDGING VIA CAUCHY-SCHWARZ DIVERGENCE
Hyeongkeun Lee, Jongmin Choi, KiHyun Nam, Joon Son Chung, Korea Advanced Institute of Science and Technology, Republic of Korea
AASP-L5.3: PICOAUDIO2: TEMPORAL CONTROLLABLE TEXT-TO-AUDIO GENERATION WITH NATURAL LANGUAGE DESCRIPTION
Zihao Zheng, Zeyu Xie, Xuenan Xu, SJTU, China; Wen Wu, Chao Zhang, Shanghai AI Lab, China; Mengyue Wu, SJTU, China
AASP-L5.4: FOLEYBENCH: A BENCHMARK FOR VIDEO-TO-AUDIO MODELS
Satvik Dixit, Carnegie Mellon University, United States of America; Koichi Saito, Sony AI, United States of America; Zhi Zhong, Sony Group Corporation, United States of America; Yuki Mitsufuji, Sony AI, United States of America; Chris Donahue, Carnegie Mellon University, United States of America
AASP-L5.5: SLAP: SCALABLE LANGUAGE-AUDIO PRETRAINING WITH VARIABLE-DURATION AUDIO AND MULTI-OBJECTIVE TRAINING
Xinhao Mei, Gael Le Lan, Haohe Liu, Zhaoheng Ni, Varun Nagaraja, Yang Liu, Yangyang Shi, Vikas Chandra, Meta, United States of America
AASP-L5.6: AUDIOCARDS: STRUCTURED METADATA IMPROVES AUDIO LANGUAGE MODELS FOR SOUND DESIGN
Sripathi Sridhar, New Jersey Institute of Technology, United States of America; Prem Seetharaman, Oriol Nieto, Adobe Research, United States of America; Mark Cartwright, New Jersey Institute of Technology, United States of America; Justin Salamon, Adobe Research, United States of America