IEEE ICASSP 2026 || Barcelona, Spain || 4-8 May 2026

AASP-L4.4

BRIDGING THE SEMANTIC GAP: CROSS-ATTENTIVE FUSION FOR JOINT ACOUSTIC-SEMANTIC SPEECH QUALITY ASSESSMENT

Zhaoyang Wang, Chengzhong Wang, Jiale Zhao, Dingding Yao, Institute of Acoustics, Chinese Academy of Sciences; University of Chinese Academy of Sciences, China; Jing Wang, Beijing Institute of Technology, China; Junfeng Li, Institute of Acoustics, Chinese Academy of Sciences; University of Chinese Academy of Sciences, China

Session: AASP-L4: Audio and Speech Quality and Intelligibility Measures II (Oral)

Location: Room 127+128

Presentation Time: Wed, 6 May, 15:00 - 15:20

Session Co-Chairs: Robin Scheibler, Google DeepMind; Marc Delcroix, NTT

Session AASP-L4

AASP-L4.1: DEEPAQ: A PERCEPTUAL AUDIO QUALITY METRIC BASED ON FOUNDATIONAL MODELS AND WEAKLY SUPERVISED LEARNING

Guanxin Jiang, International Audio Laboratories Erlangen, Germany; Andreas Brendel, Pablo Manuel Delgado, Fraunhofer IIS, Germany; Jürgen Herre, International Audio Laboratories Erlangen, Fraunhofer IIS, Germany

AASP-L4.2: THE 3RD CLARITY PREDICTION CHALLENGE: A MACHINE LEARNING CHALLENGE FOR HEARING AID SPEECH INTELLIGIBILITY PREDICTION

Jonathan Barker, University of Sheffield, United Kingdom of Great Britain and Northern Ireland; Michael Akeroyd, University of Nottingham, United Kingdom of Great Britain and Northern Ireland; Trevor Cox, University of Salford, United Kingdom of Great Britain and Northern Ireland; John Culling, University of Cardiff, United Kingdom of Great Britain and Northern Ireland; Jennifer Firth, University of Nottingham, United Kingdom of Great Britain and Northern Ireland; Simone Graetzer, University of Salford, United Kingdom of Great Britain and Northern Ireland; Graham Naylor, University of Nottingham, United Kingdom of Great Britain and Northern Ireland

AASP-L4.3: QUALITY ASSESSMENT OF NOISY AND ENHANCED SPEECH WITH LIMITED DATA: UWB-NTIS SYSTEM FOR VOICEMOS 2024

Marie Kunešová, Aleš Pražák, Jan Lehečka, University of West Bohemia in Pilsen, Czechia

AASP-L4.4: BRIDGING THE SEMANTIC GAP: CROSS-ATTENTIVE FUSION FOR JOINT ACOUSTIC-SEMANTIC SPEECH QUALITY ASSESSMENT

Zhaoyang Wang, Chengzhong Wang, Jiale Zhao, Dingding Yao, Institute of Acoustics, Chinese Academy of Sciences; University of Chinese Academy of Sciences, China; Jing Wang, Beijing Institute of Technology, China; Junfeng Li, Institute of Acoustics, Chinese Academy of Sciences; University of Chinese Academy of Sciences, China

AASP-L4.5: SA-SSL-MOS: SELF-SUPERVISED LEARNING MOS PREDICTION WITH SPECTRAL AUGMENTATION FOR GENERALIZED MULTI-RATE SPEECH ASSESSMENT

Fengyuan Cao, Xinyu Liang, Fredrik Cumlin, KTH Royal Institute of Technology, Sweden; Victor Ungureanu, Chandan K. A. Reddy, Christian Schuldt, Google LLC, Switzerland; Saikat Chatterjee, KTH Royal Institute of Technology, Sweden

AASP-L4.6: QASTANET: A DNN-BASED QUALITY METRIC FOR SPATIAL AUDIO

Adrien Llave, Emma Granier, Grégory Pallone, Orange, France

©2026 IEEE – All rights reserved.

Last updated 22 April 2026.
