Technical Program

ASR-1: Automatic Speech Recognition I

Session Type: Poster
Time: Sunday, December 15, 10:30 - 12:00
Location: VHS Event Centre, Level 1
Session Chair: Koichi Shinoda, Tokyo Institute of Technology
 
ASR-1.1: INCREMENTAL LATTICE DETERMINIZATION FOR WFST DECODERS
Zhehuai Chen, Shanghai Jiao Tong University, China; Mahsa Yarmohammadi, Hainan Xu, Johns Hopkins University, United States; Hang Lv, Lei Xie, Northwestern Polytechnical University, China; Daniel Povey, Sanjeev Khudanpur, Johns Hopkins University, United States
 
ASR-1.2: A COMPARISON OF TRANSFORMER AND LSTM ENCODER DECODER MODELS FOR ASR
Albert Zeyer, Parnia Bahar, Kazuki Irie, Ralf Schlüter, Hermann Ney, RWTH Aachen University, Germany
 
ASR-1.3: A DROPOUT-BASED SINGLE MODEL COMMITTEE APPROACH FOR ACTIVE LEARNING IN ASR
Jiayi Fu, Kuang Ru, Zhuiyi Technology Company, China
 
ASR-1.4: PERSONALIZATION OF END-TO-END SPEECH RECOGNITION ON MOBILE DEVICES FOR NAMED ENTITIES
Khe Chai Sim, Francoise Beaufays, Arnaud Benard, Dhruv Guliani, Andreas Kabel, Nikhil Khare, Tamar Lucassen, Petr Zadrazil, Harry Zhang, Leif Johnson, Giovanni Motta, Lillian Zhou, Google, United States
 
ASR-1.5: SIMULTANEOUS SPEECH RECOGNITION AND SPEAKER DIARIZATION FOR MONAURAL DIALOGUE RECORDINGS WITH TARGET-SPEAKER ACOUSTIC MODELS
Naoyuki Kanda, Shota Horiguchi, Yusuke Fujita, Yawen Xue, Kenji Nagamatsu, Hitachi, Ltd., Japan; Shinji Watanabe, Johns Hopkins University, United States
 
ASR-1.6: INTEGRATING SOURCE-CHANNEL AND ATTENTION-BASED SEQUENCE-TO-SEQUENCE MODELS FOR SPEECH RECOGNITION
Qiujia Li, Chao Zhang, Phil Woodland, University of Cambridge, United Kingdom
 
ASR-1.7: AN INVESTIGATION INTO THE EFFECTIVENESS OF ENHANCEMENT IN ASR TRAINING AND TEST FOR CHIME-5 DINNER PARTY TRANSCRIPTION
Catalin Zorila, Toshiba Cambridge Research Laboratory, United Kingdom; Christoph Boeddeker, Paderborn University, Germany; Rama Doddipatla, Toshiba Cambridge Research Laboratory, United Kingdom; Reinhold Haeb-Umbach, Paderborn University, Germany
 
ASR-1.8: STATE-OF-THE-ART SPEECH RECOGNITION USING MULTI-STREAM SELF-ATTENTION WITH DILATED 1D CONVOLUTIONS
Kyu Han, Ramon Prieto, Tao Ma, ASAPP, Inc., United States
 
ASR-1.9: HIGHLY EFFICIENT NEURAL NETWORK LANGUAGE MODEL COMPRESSION USING SOFT BINARIZATION TRAINING
Rao Ma, Qi Liu, Kai Yu, Shanghai Jiao Tong University, China
 
ASR-1.10: IMPROVED MULTI-STAGE TRAINING OF ONLINE ATTENTION-BASED ENCODER-DECODER MODELS
Abhinav Garg, Dhananjaya Gowda, Ankur Kumar, Kwangyoun Kim, Mehul Kumar, Chanwoo Kim, Samsung Research, Korea (South)
 
ASR-1.11: LEAD2GOLD: TOWARDS EXPLOITING THE FULL POTENTIAL OF NOISY TRANSCRIPTIONS FOR SPEECH RECOGNITION
Adrien Dufraux, Facebook AI Research, France; Emmanuel Vincent, INRIA, France; Awni Hannun, Facebook AI Research, United States; Armelle Brun, Université de Lorraine, France; Matthijs Douze, Facebook AI Research, France
 
ASR-1.12: ORTHOGONALITY CONSTRAINED MULTI-HEAD ATTENTION FOR KEYWORD SPOTTING
Mingu Lee, Jinkyu Lee, Hye Jin Jang, Byeonggeun Kim, Wonil Chang, Kyuwoong Hwang, Qualcomm AI Research, Korea (South)
 
ASR-1.13: LEARNING BETWEEN DIFFERENT TEACHER AND STUDENT MODELS IN ASR
Jeremy Heng Meng Wong, Microsoft, United States; Mark John Francis Gales, Yu Wang, University of Cambridge, United Kingdom
 
ASR-1.14: A UNIFIED ENDPOINTER USING MULTITASK AND MULTIDOMAIN TRAINING
Shuo-Yiin Chang, Bo Li, Gabor Simko, Google, United States
 
ASR-1.15: DOMAIN EXPANSION IN DNN-BASED ACOUSTIC MODELS FOR ROBUST SPEECH RECOGNITION
Shahram Ghorbani, Soheil Khorram, John H.L. Hansen, University of Texas at Dallas, United States
 
ASR-1.16: IMPROVING RNN TRANSDUCER MODELING FOR END-TO-END SPEECH RECOGNITION
Jinyu Li, Rui Zhao, Hu Hu, Yifan Gong, Microsoft, United States
 
ASR-1.17: SIMPLE GATED CONVNET FOR SMALL FOOTPRINT ACOUSTIC MODELING
Lukas Lee, Jinhwan Park, Wonyong Sung, Seoul National University, Korea (South)
 
ASR-1.18: GANS FOR CHILDREN: A GENERATIVE DATA AUGMENTATION STRATEGY FOR CHILDREN SPEECH RECOGNITION
Peiyao Sheng, Zhuolin Yang, Yanmin Qian, Shanghai Jiao Tong University, China
 
ASR-1.19: ESPRESSO: A FAST END-TO-END NEURAL SPEECH RECOGNITION TOOLKIT
Yiming Wang, Tongfei Chen, Hainan Xu, Shuoyang Ding, Johns Hopkins University, United States; Hang Lv, Northwestern Polytechnical University, China; Yiwen Shao, Johns Hopkins University, United States; Nanyun Peng, University of Southern California, United States; Lei Xie, Northwestern Polytechnical University, China; Shinji Watanabe, Sanjeev Khudanpur, Johns Hopkins University, United States