SLP-P19.12
MULTITASK SPEECH RECOGNITION AND SPEAKER CHANGE DETECTION FOR UNKNOWN NUMBER OF SPEAKERS
Shashi Kumar, Srikanth Madikeri, Iuliia Nigmatulina, Esaú Villatoro-Tello, Petr Motlicek, Idiap Research Institute, Martigny, Switzerland; Karthik Pandia, S. Pavankumar Dubagunta, Aravind Ganapathiraju, Uniphore Software Systems Inc., Palo Alto, CA, USA, India
Session:
SLP-P19: Speaker Diarization II Poster
Track:
Speech and Language Processing
Location:
Poster Zone 3B
Poster Board PZ-3B.12
Presentation Time:
Thu, 18 Apr, 08:20 - 10:20 (UTC +9)
Session Co-Chairs:
Liping Chen, University of Science and Technology of China and Rohan Kumar Das, Fortemedia Singapore
Session SLP-P19
SLP-P19.1: FRAME-WISE STREAMING END-TO-END SPEAKER DIARIZATION WITH NON-AUTOREGRESSIVE SELF-ATTENTION-BASED ATTRACTORS
Di Liang, Nian Shao, Zhejiang University; Westlake University, China; Xiaofei Li, Westlake University; Westlake Institute for Advanced Study, China
SLP-P19.2: NTT SPEAKER DIARIZATION SYSTEM FOR CHIME-7: MULTI-DOMAIN, MULTI-MICROPHONE END-TO-END AND VECTOR CLUSTERING DIARIZATION
Naohiro Tawara, Marc Delcroix, Atsushi Ando, Atsunori Ogawa, NTT Communication Science Laboratories, Japan
SLP-P19.3: ONLINE SPEAKER DIARIZATION OF MEETINGS GUIDED BY SPEECH SEPARATION
Elio Gruttadauria, Mathieu Fontaine, Slim Essid, Télécom Paris, France
SLP-P19.4: ENHANCING LOW-LATENCY SPEAKER DIARIZATION WITH SPATIAL DICTIONARY LEARNING
Weiguang Chen, Hunan University, China; The Anh Tran, Nanyang Technological University, Singapore; Xionghu Zhong, Hunan University, China; Eng Siong Chng, Nanyang Technological University, Singapore
SLP-P19.5: NEURAL SPEAKER DIARIZATION USING MEMORY-AWARE MULTI-SPEAKER EMBEDDING WITH SEQUENCE-TO-SEQUENCE ARCHITECTURE
GaoBin Yang, MaoKui He, Shutong Niu, Ruoyu Wang, University of Science and Technology of China, China; Yanyan Yue, Shuangqing Qian, iFlytek Research, China; Shilong Wu, Jun Du, University of Science and Technology of China, China; Chinhui Lee, Georgia Institute of Technology, United States of America
SLP-P19.6: USM-SCD: MULTILINGUAL SPEAKER CHANGE DETECTION BASED ON LARGE PRETRAINED FOUNDATION MODELS
Guanlong Zhao, Yongqiang Wang, Jason Pelecanos, Yu Zhang, Hank Liao, Yiling Huang, Han Lu, Quan Wang, Google LLC, United States of America
SLP-P19.7: APOLLO'S UNHEARD VOICES: GRAPH ATTENTION NETWORKS FOR SPEAKER DIARIZATION AND CLUSTERING FOR FEARLESS STEPS APOLLO COLLECTION
Meena Chandra Shekar, John Hansen, The University of Texas at Dallas, United States of America
SLP-P19.8: GEODESIC INTERPOLATION OF FRAME-WISE SPEAKER EMBEDDINGS FOR THE DIARIZATION OF MEETING SCENARIOS
Tobias Cord-Landwehr, Christoph Boeddeker, Paderborn University, Germany; Cătălin Zorilă, Rama Doddipatla, Toshiba Europe Ltd, United Kingdom of Great Britain and Northern Ireland; Reinhold Haeb-Umbach, Paderborn University, Germany
SLP-P19.9: PROFILE-ERROR-TOLERANT TARGET-SPEAKER VOICE ACTIVITY DETECTION
Dongmei Wang, Xiong Xiao, Naoyuki Kanda, Midia Yousefi, Takuya Yoshioka, Jian Wu, Microsoft, United States of America
SLP-P19.10: IMPROVING NEURAL DIARIZATION THROUGH SPEAKER ATTRIBUTE ATTRACTORS AND LOCAL DEPENDENCY MODELING
David Palzer, The Ohio State University, United States of America; Matthew Maciejewski, Johns Hopkins University, United States of America; Eric Fosler-Lussier, The Ohio State University, United States of America
SLP-P19.11: A SPATIAL LONG-TERM ITERATIVE MASK ESTIMATION APPROACH FOR MULTI-CHANNEL SPEAKER DIARIZATION AND SPEECH RECOGNITION
Feng Ma, University of Science and Technology of China, China; Yanhui Tu, iFlytek, China; Maokui He, Ruoyu Wang, Shutong Niu, University of Science and Technology of China, China; Lei Sun, iFlytek, China; Zhongfu Ye, Jun Du, University of Science and Technology of China, China; Jia Pan, iFlytek, China; Chin-Hui Lee, Georgia Institute of Technology, United States of America
SLP-P19.12: MULTITASK SPEECH RECOGNITION AND SPEAKER CHANGE DETECTION FOR UNKNOWN NUMBER OF SPEAKERS
Shashi Kumar, Srikanth Madikeri, Iuliia Nigmatulina, Esaú Villatoro-Tello, Petr Motlicek, Idiap Research Institute, Martigny, Switzerland; Karthik Pandia, S. Pavankumar Dubagunta, Aravind Ganapathiraju, Uniphore Software Systems Inc., Palo Alto, CA, USA, India