AASP-P19.3

RFM-EDITING: RECTIFIED FLOW MATCHING FOR TEXT-GUIDED AUDIO EDITING

Liting Gao, Yi Yuan, Yaru Chen, Yuelan Cheng, University of Surrey, United Kingdom of Great Britain and Northern Ireland; Zhenbo Li, Juan Wen, China Agricultural University, China; Shubin Zhang, Ocean University of China, China; Wenwu Wang, University of Surrey, United Kingdom of Great Britain and Northern Ireland

Session: AASP-P19: Sound Generation and Synthesis Poster
Track: Audio and Acoustic Signal Processing [AA]
Location: Poster Area 25
Presentation Time: Thu, 7 May, 14:00 - 16:00

Session AASP-P19
AASP-P19.1: Text2Move: Text-to-moving sound generation via trajectory prediction and temporal alignment
Yunyi Liu, University of Sydney, Australia; Shaofan Yang, Kai Li, Xu Li, Dolby Laboratories, China
AASP-P19.2: FXSEARCHER: GRADIENT-FREE TEXT-DRIVEN AUDIO TRANSFORMATION
Hojoon Ki, Jongsuk Kim, Minchan Kwon, Junmo Kim, Korea Advanced Institute of Science and Technology, Korea, Republic of
AASP-P19.3: RFM-EDITING: RECTIFIED FLOW MATCHING FOR TEXT-GUIDED AUDIO EDITING
Liting Gao, Yi Yuan, Yaru Chen, Yuelan Cheng, University of Surrey, United Kingdom of Great Britain and Northern Ireland; Zhenbo Li, Juan Wen, China Agricultural University, China; Shubin Zhang, Ocean University of China, China; Wenwu Wang, University of Surrey, United Kingdom of Great Britain and Northern Ireland
AASP-P19.4: MIX2MORPH: LEARNING SOUND MORPHING FROM NOISY MIXES
Annie Chu, Hugo Flores-García, Northwestern University, United States of America; Oriol Nieto, Justin Salamon, Adobe Research, United States of America; Bryan Pardo, Northwestern University, United States of America; Prem Seetharaman, Adobe Research, United States of America
AASP-P19.5: GENERATIVE AUDIO EXTENSION AND MORPHING
Prem Seetharaman, Oriol Nieto, Justin Salamon, Adobe, United States of America
AASP-P19.6: Feedback-driven Retrieval-augmented Audio Generation with Large Audio Language Models
Junqi Zhao, University of Surrey, United Kingdom of Great Britain and Northern Ireland; Chenxing Li, Tencent, China; Jinzheng Zhao, University of Surrey, United Kingdom of Great Britain and Northern Ireland; Rilin Chen, Dong Yu, Tencent, China; Mark Plumbley, Wenwu Wang, University of Surrey, United Kingdom of Great Britain and Northern Ireland
AASP-P19.7: Taming Audio VAEs via Target-KL Regularization
Prem Seetharaman, Rithesh Kumar, Adobe Research, United States of America
AASP-P19.8: DIVERSE AND FEW-STEP AUDIO CAPTIONING VIA FLOW MATCHING
Naoaki Fujita, Hiroki Nakamura, Kosuke Itakura, Panasonic Holdings, Japan
AASP-P19.9: FLASHFOLEY: FAST INTERACTIVE SKETCH2AUDIO GENERATION
Zachary Novack, UC San Diego, United States of America; Koichi Saito, Sony AI, United States of America; Zhi Zhong, Sony Group Corporation, Japan; Takashi Shibuya, Sony AI, United States of America; Shuyang Cui, Sony Group Corporation, Japan; Julian McAuley, Taylor Berg-Kirkpatrick, UC San Diego, United States of America; Christian Simon, Shusuke Takahashi, Sony Group Corporation, Japan; Yuki Mitsufuji, Sony AI, United States of America
AASP-P19.10: WavJourney: Compositional Audio Creation With Large Language Models
Xubo Liu, University of Surrey, United Kingdom of Great Britain and Northern Ireland; Zhongkai Zhu, Reka AI, United Kingdom of Great Britain and Northern Ireland; Haohe Liu, Yi Yuan, Qiushi Huang, Meng Cui, University of Surrey, United Kingdom of Great Britain and Northern Ireland; Jinhua Liang, QMUL, United Kingdom of Great Britain and Northern Ireland; Yin Cao, Xi’an Jiaotong-Liverpool University, China; Qiuqiang Kong, CUHK, Hong Kong; Mark Plumbley, KCL, United Kingdom of Great Britain and Northern Ireland; Wenwu Wang, University of Surrey, United Kingdom of Great Britain and Northern Ireland