AASP-P11: Audio and Speech Quality and Intelligibility Measures III
Wed, 6 May, 14:00 - 16:00
Location: Poster Area 26
Session Type: Poster
Track: Audio and Acoustic Signal Processing [AA]

AASP-P11.1: WHEN NOISE LOWERS THE LOSS: RETHINKING LIKELIHOOD-BASED EVALUATION IN MUSIC LARGE LANGUAGE MODELS

Xiaosha Li, Georgia Institute of Technology, United States of America; Chun Liu, ByteDance Inc., United States of America; Ziyu Wang, Courant Institute of Mathematical Sciences, New York University; Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), Abu Dhabi, UAE, United States of America

AASP-P11.2: MUSHRA–1S: A SCALABLE AND SENSITIVE TEST APPROACH FOR EVALUATING TOP-TIER SPEECH PROCESSING SYSTEMS

Laura Lechler, Ivana Balić, Cisco Systems, United Kingdom of Great Britain and Northern Ireland

AASP-P11.3: A GENERALIZATION STRATEGY FOR SPEECH QUALITY PREDICTION: FROM DOMAIN-SPECIFIC TO UNIFIED DATASETS

Imran Kibria, Ada Lamba, Donald Williamson, The Ohio State University, United States of America

AASP-P11.4: CAN HIERARCHICAL CROSS-MODAL FUSION PREDICT HUMAN PERCEPTION OF AI DUBBED CONTENT?

Ashwini Dasare, Nirmesh Shah, Ashishkumar Gudmalwar, Pankaj Wasnik, Sony Research India, India

AASP-P11.5: PADAM: Perceptual Audio Defect Assessment Model

Alex Mackin, Pratha Khandelwal, Veneta Haralampieva, Michael Lau, Benoit Vallade, David Higham, Josh Anderson, Amazon Prime Video, United Kingdom of Great Britain and Northern Ireland

AASP-P11.6: UNSEEN BUT NOT UNKNOWN: USING DATASET CONCEALMENT TO ROBUSTLY EVALUATE SPEECH QUALITY ESTIMATION MODELS

Jaden Pieper, Stephen Voran, Institute for Telecommunication Sciences, United States of America

AASP-P11.7: RHO-PERFECT: CORRELATION CEILING FOR SUBJECTIVE EVALUATION DATASETS

Fredrik Cumlin, KTH Royal Institute of Technology, Sweden

AASP-P11.8: ENHANCED GENERATIVE MACHINE LISTENER

Vishnu Raj, Gouthaman KV, Shiv Gehlot, Dolby Laboratories India Pvt Ltd, India; Lars Villemoes, Dolby Sweden AB, Sweden; Arijit Biswas, Dolby Germany GmbH, Germany

AASP-P11.9: MULTI-TASK LEARNING FOR SPEECH QUALITY ASSESSMENT USING ASR-DERIVED ENTROPY FEATURES

Tri Dung Do, Bao Thang Ta, Viettel AI, Viettel Group, Viet Nam; Van Hai Do, Thuyloi University, Viet Nam

AASP-P11.10: FUSEMOS: PERCEPTUAL EVALUATION OF TEXT-TO-MUSIC GENERATION WITH DUAL-ENCODER FUSION AND RANKING-AWARE COMPOSITE LOSS

Yang Jing, Wuhan University, China; Wang Haoyu, Pan Ningning, Southwestern University of Finance and Economics, China; Wang Zhao, Yang Jianxuan, Xiaomi’s Wuhan Headquarters, China; Huang Gongping, Wuhan University, China