AASP-P24.3
MIDI-LLaMA: An Instruction-Following Multimodal LLM for Symbolic Music Understanding
Meng Yang, Jon McCormack, Monash University, Australia; Maria Teresa Llano, University of Sussex, United Kingdom of Great Britain and Northern Ireland; Wanchao Su, Monash University, Australia; Chao Lei, The University of Melbourne, Australia
Session: AASP-P24: Generative Models for Music Poster
Track: Audio and Acoustic Signal Processing [AA]
Location: Poster Area 26
Presentation Time: Thu, 7 May, 16:30 - 18:30
Session AASP-P24
AASP-P24.1: TINYMU: A COMPACT AUDIO-LANGUAGE MODEL FOR MUSIC UNDERSTANDING
Xiquan Li, Aurian Quelennec, Slim Essid, Télécom Paris, Institut Polytechnique de Paris, France
AASP-P24.2: HIERARCHICAL TOKENIZATION OF MULTIMODAL MUSIC DATA FOR GENERATIVE MUSIC RETRIEVAL
Wo Jae Lee, Rifat Joyee, Zhonghao Luo, Sudev Mukherjee, Emanuele Coviello, Amazon Music, United States of America
AASP-P24.3: MIDI-LLaMA: An Instruction-Following Multimodal LLM for Symbolic Music Understanding
Meng Yang, Jon McCormack, Monash University, Australia; Maria Teresa Llano, University of Sussex, United Kingdom of Great Britain and Northern Ireland; Wanchao Su, Monash University, Australia; Chao Lei, The University of Melbourne, Australia
AASP-P24.4: POLY-SVC: POLYPHONY-AWARE SINGING VOICE CONVERSION WITH HARMONIC MODELING
Chen Geng, Beijing University of Civil Engineering and Architecture, China; Meng Chen, Tencent Music Entertainment, China; Ruohua Zhou, Beijing University of Civil Engineering and Architecture, China; Ruolan Liu, Independent Researcher, China; Weifeng Zhao, Tencent Music Entertainment, China
AASP-P24.5: FINE-TUNING BIGVGAN-V2 FOR ROBUST MUSICAL TUNING PRESERVATION
Hans-Ulrich Berendes, Ben Maman, Meinard Müller, International Audio Laboratories Erlangen, Germany
AASP-P24.6: RETHINKING MUSIC CAPTIONING WITH MUSIC METADATA LLMS
Irmak Bukey, Carnegie Mellon University, United States of America; Zhepei Wang, Adobe Research, United States of America; Chris Donahue, Carnegie Mellon University, United States of America; Nicholas J. Bryan, Adobe Research, United States of America
AASP-P24.7: STYLEPITCHER: GENERATING STYLE-FOLLOWING AND EXPRESSIVE PITCH CURVES FOR VERSATILE SINGING TASKS
Jingyue Huang, Qihui Yang, University of California San Diego, United States of America; Fei-Yueh Chen, University of Rochester, United States of America; Julian McAuley, University of California San Diego, United States of America; Randal Leistikow, Perry R. Cook, Yongyi Zang, Smule Labs, United States of America
AASP-P24.8: MITIGATING DATA REPLICATION IN TEXT-TO-AUDIO GENERATIVE DIFFUSION MODELS THROUGH ANTI-MEMORIZATION GUIDANCE
Francisco Messina, Francesca Ronchini, Luca Comanducci, Paolo Bestagini, Fabio Antonacci, Politecnico di Milano, Italy
AASP-P24.9: Symphony Rendering: MIDI and Composer-Conditioned Auto Orchestration with Flow-Matching Transformers
Jiahe Lei, Qiuqiang Kong, The Chinese University of Hong Kong, Hong Kong, China