AASP-P20.3
Auditory Illusion Benchmark for Large Audio Language Models
Hayoon Kim, Eunice Hong, Kyogu Lee, Seoul National University, Korea, Republic of
Session: AASP-P20: Data and Benchmark for Audio and Speech Poster
Track: Audio and Acoustic Signal Processing [AA]
Location: Poster Area 26
Presentation Time: Thu, 7 May, 14:00 - 16:00
Session AASP-P20
AASP-P20.1: ECHOFAKE: A REPLAY-AWARE DATASET FOR PRACTICAL SPEECH DEEPFAKE DETECTION
Tong Zhang, Yihuan Huang, Yanzhen Ren, Wuhan University, China
AASP-P20.2: 3D MESH GRID ROOM IMPULSE RESPONSES MEASURED WITH A LINEAR MICROPHONE ARRAY AND SUPPRESSION OF FRAME REFLECTIONS
Yoichi Haneda, Yi Ren, The University of Electro-Communications, Japan
AASP-P20.3: Auditory Illusion Benchmark for Large Audio Language Models
Hayoon Kim, Eunice Hong, Kyogu Lee, Seoul National University, Korea, Republic of
AASP-P20.4: TAGARELA - A PORTUGUESE SPEECH DATASET FROM PODCASTS
Frederico Santos de Oliveira, Federal University of Mato Grosso (UFMT), Brazil; Lucas Rafael Stefanel Gris, Alef Iury Siqueira Ferreira, Federal University of Goias (UFG), Brazil; Augusto Seben da Rosa, Universidade Estadual Paulista, Brazil; Alexandre Costa Ferro Filho, Federal University of Goias (UFG), Brazil; Edresson Casanova, NVIDIA, Brazil; Christopher Dane Shulby, Elsa Speak, United States of America; Rafael Teixeira Sousa, Federal University of Mato Grosso (UFMT), Brazil; Diogo Fernandes Costa Silva, Anderson da Silva Soares, Arlindo Rodrigues Galvão Filho, Federal University of Goias (UFG), Brazil
AASP-P20.5: Representation-Based Data Quality Audits for Audio
Alvaro Gonzalez-Jimenez, Lucerne University of Applied Sciences and Arts, University Hospital of Basel, Switzerland; Fabian Gröger, Linda Wermelinger, Lucerne University of Applied Sciences and Arts, University of Basel, Switzerland; Andrin Bürli, Iason Kastanis, CSEM, Switzerland; Simone Lionetti, Marc Pouly, Lucerne University of Applied Sciences and Arts, Switzerland
AASP-P20.6: SynParaSpeech: Automated Synthesis of Paralinguistic Datasets for Speech Generation and Understanding
Bingsong Bai, Qihang Lu, Wenbing Yang, Beijing University of Posts and Telecommunications, China; Zihan Sun, Yueran Hou, Peilei Jia, Songbai Pu, Hello Group Inc., China; Ruibo Fu, Chinese Academy of Sciences, China; Yingming Gao, Ya Li, Beijing University of Posts and Telecommunications, China; Jun Gao, Hello Group Inc., China
AASP-P20.7: LOTUSDIS: A THAI FAR-FIELD MEETING CORPUS FOR ROBUST CONVERSATIONAL ASR
Pattara Tipaksorn, Sumonmas Thatphithakkul, Vataya Chunwijitra, Kwanchiva Thangthai, National Electronic and Computer Technology Center, Thailand
AASP-P20.8: A Dataset of Robot-Patient and Doctor-Patient Medical Dialogues for Spoken Language Processing Tasks
Heriberto Cuayahuitl, Grace Jang, University of Lincoln, United Kingdom of Great Britain and Northern Ireland
AASP-P20.9: TAU: A BENCHMARK FOR CULTURAL SOUND UNDERSTANDING BEYOND SEMANTICS
Yi-Cheng Lin, National Taiwan University, Taiwan; Yu-Hua Chen, University of Toronto, Canada; Jia-Kai Dong, Yueh-Hsuan Huang, Szu-Chi Chen, Yu-Chen Chen, Chih-Yao Chen, Yu-Jung Lin, Yu-Ling Chen, Zih-Yu Chen, I-Ning Tsai, Hsiu-Hsuan Wang, Ho-Lam Chung, Ke-Han Lu, Hung-yi Lee, National Taiwan University, Taiwan
AASP-P20.10: LONGSPEECH: A SCALABLE BENCHMARK FOR TRANSCRIPTION, TRANSLATION AND UNDERSTANDING IN LONG SPEECH
Fei Yang, Shanghai Jiao Tong University, China; Xuanfan Ni, Alibaba International Digital Commerce, China; Renyi Yang, Delft University of Technology, Netherlands; Jiahui Geng, Qing Li, Mohamed bin Zayed University of Artificial Intelligence, United Arab Emirates; Chenyang Lyu, Yichao Du, Longyue Wang, Weihua Luo, Kaifu Zhang, Alibaba International Digital Commerce, China