AASP-P14.5

PARAMETER EFFICIENT AUDIO CAPTIONING WITH FAITHFUL GUIDANCE USING AUDIO-TEXT SHARED LATENT REPRESENTATION

Arvind Krishna Sridhar, Yinyi Guo, Erik Visser, Rehana Mahfuz, Qualcomm Technologies, United States of America

Session:
AASP-P14: Sound events detection, description and generation Poster

Track:
Audio and Acoustic Signal Processing

Location:
Poster Zone 2C
Poster Board PZ-2C.5

Presentation Time:
Thu, 18 Apr, 13:10 - 15:10 (UTC +9)

Session Chair:
Annamaria Mesaros, Tampere University
Session AASP-P14
AASP-P14.1: NATURAL LANGUAGE SUPERVISION FOR GENERAL-PURPOSE AUDIO REPRESENTATIONS
Benjamin Elizalde, Soham Deshmukh, Huaming Wang, Microsoft, United States of America
AASP-P14.2: AUDIO-FREE PROMPT TUNING FOR LANGUAGE-AUDIO MODELS
Yiming Li, Xiangdong Wang, Hong Liu, Institute of Computing Technology, Chinese Academy of Sciences, China
AASP-P14.3: SEMANTIC PROXIMITY ALIGNMENT: TOWARDS HUMAN PERCEPTION-CONSISTENT AUDIO TAGGING BY ALIGNING WITH LABEL TEXT DESCRIPTION
Wuyang Liu, Yanzhen Ren, Wuhan University, China
AASP-P14.4: A DETAILED AUDIO-TEXT DATA SIMULATION PIPELINE USING SINGLE-EVENT SOUNDS
Xuenan Xu, Xiaohang Xu, Zeyu Xie, Pingyue Zhang, Mengyue Wu, Kai Yu, Shanghai Jiao Tong University, China
AASP-P14.5: PARAMETER EFFICIENT AUDIO CAPTIONING WITH FAITHFUL GUIDANCE USING AUDIO-TEXT SHARED LATENT REPRESENTATION
Arvind Krishna Sridhar, Yinyi Guo, Erik Visser, Rehana Mahfuz, Qualcomm Technologies, United States of America
AASP-P14.6: GRAPH ATTENTION FOR AUTOMATED AUDIO CAPTIONING
Feiyang Xiao, Jian Guan, Harbin Engineering University, China; Qiaoxi Zhu, University of Technology Sydney, Australia; Wenwu Wang, University of Surrey, United Kingdom of Great Britain and Northern Ireland
AASP-P14.7: STRONG LABELING OF SOUND EVENTS USING CROWDSOURCED WEAK LABELS AND ANNOTATOR COMPETENCE ESTIMATION
Irene Martín-Morató, Annamaria Mesaros, Tampere University, Finland
AASP-P14.8: CONTRASTIVE LOSS BASED FRAME-WISE FEATURE DISENTANGLEMENT FOR POLYPHONIC SOUND EVENT DETECTION
Yadong Guan, Jiqing Han, Hongwei Song, Wenjie Song, Guibin Zheng, Tieran Zheng, Yongjun He, Harbin Institute of Technology, China
AASP-P14.9: LEARNING ONTOLOGY INFORMED REPRESENTATIONS WITH CONSTRAINTS FOR ACOUSTIC EVENT DETECTION
Akshay Raina, Sayeedul Islam Sheikh, Vipul Arora, Indian Institute of Technology Kanpur, India
AASP-P14.10: PERFORMANCE AND ENERGY BALANCE: A COMPREHENSIVE STUDY OF STATE-OF-THE-ART SOUND EVENT DETECTION SYSTEMS
Francesca Ronchini, Politecnico di Milano, Italy; Romain Serizel, Université de Lorraine, CNRS, Inria, Loria, France