Technical Program

ASR-2: Automatic Speech Recognition II

Session Type: Poster
Time: Monday, December 16, 10:30 - 12:00
Location: VHS Event Centre, Level 1
Session Chair: Hemant Patil, Dhirubhai Ambani Institute of Information and Communication Technology
 
ASR-2.1: TRAINING LANGUAGE MODELS FOR LONG-SPAN CROSS-SENTENCE EVALUATION
Kazuki Irie, Albert Zeyer, Ralf Schlüter, Hermann Ney, RWTH Aachen University, Germany
 
ASR-2.2: TRANSFORMER ASR WITH CONTEXTUAL BLOCK PROCESSING
Emiru Tsunoo, Yosuke Kashiwagi, Toshiyuki Kumakura, Sony Corporation, Japan; Shinji Watanabe, Johns Hopkins University, United States
 
ASR-2.3: A DENSITY RATIO APPROACH TO LANGUAGE MODEL FUSION IN END-TO-END AUTOMATIC SPEECH RECOGNITION
Erik McDermott, Hasim Sak, Ehsan Variani, Google Inc, United States
 
ASR-2.4: IMPROVING GRAPHEME-TO-PHONEME CONVERSION BY INVESTIGATING COPYING MECHANISM IN RECURRENT ARCHITECTURES
Abhishek Niranjan, Mahaboob Ali Basha Shaik, Samsung Research and Development Institute, India
 
ASR-2.5: A COMPARATIVE STUDY ON TRANSFORMER VS RNN IN SPEECH APPLICATIONS
Shigeki Karita, NTT Communication Science Laboratories, Japan; Nanxin Chen, Johns Hopkins University, United States; Tomoki Hayashi, Nagoya University, Japan; Takaaki Hori, Mitsubishi Electric Research Laboratories (MERL), United States; Hirofumi Inaguma, Kyoto University, Japan; Ziyan Jiang, Johns Hopkins University, United States; Masao Someki, Nagoya University, Japan; Nelson Enrique Yalta Soplin, Waseda University, Japan; Ryuichi Yamamoto, LINE Corporation, Japan; Xiaofei Wang, Shinji Watanabe, Johns Hopkins University, United States; Takenori Yoshimura, Nagoya University, Japan; Wangyou Zhang, Shanghai Jiao Tong University, China
 
ASR-2.6: FROM SENONES TO CHENONES: TIED CONTEXT-DEPENDENT GRAPHEMES FOR HYBRID SPEECH RECOGNITION
Duc Le, Xiaohui Zhang, Weiyi Zheng, Christian Fuegen, Geoffrey Zweig, Michael L. Seltzer, Facebook, United States
 
ASR-2.7: ATTENTION-BASED SPEECH RECOGNITION USING GAZE INFORMATION
Osamu Segawa, Chubu Electric Power Co., Inc., Japan; Tomoki Hayashi, Kazuya Takeda, Nagoya University, Japan
 
ASR-2.8: LISTENING WHILE SPEAKING AND VISUALIZING: IMPROVING ASR THROUGH MULTIMODAL CHAIN
Johanes Effendi, Nara Institute of Science and Technology / RIKEN Center for Advanced Intelligence Project AIP, Japan; Andros Tjandra, Nara Institute of Science and Technology, Japan; Sakriani Sakti, Satoshi Nakamura, Nara Institute of Science and Technology / RIKEN Center for Advanced Intelligence Project AIP, Japan
 
ASR-2.9: EMBEDDINGS FOR DNN SPEAKER ADAPTIVE TRAINING
Joanna Rownicka, Peter Bell, Steve Renals, University of Edinburgh, United Kingdom
 
ASR-2.10: LANGUAGE MODEL BOOTSTRAPPING USING NEURAL MACHINE TRANSLATION FOR CONVERSATIONAL SPEECH RECOGNITION
Surabhi Punjabi, Harish Arsikere, Sri Garimella, Amazon, India
 
ASR-2.11: SPEAKER AND LANGUAGE AWARE TRAINING FOR END-TO-END ASR
Shubham Bansal, Karan Malhotra, Sriram Ganapathy, Indian Institute of Science, Bangalore, India
 
ASR-2.12: DATA AUGMENTATION BASED ON VOWEL STRETCH FOR IMPROVING CHILDREN'S SPEECH RECOGNITION
Tohru Nagano, Takashi Fukuda, Masayuki Suzuki, Gakuto Kurata, IBM, Japan
 
ASR-2.13: MIXED BANDWIDTH ACOUSTIC MODELING LEVERAGING KNOWLEDGE DISTILLATION
Takashi Fukuda, Samuel Thomas, IBM, Japan
 
ASR-2.14: ON TEMPORAL CONTEXT INFORMATION FOR HYBRID BLSTM-BASED PHONEME RECOGNITION
Timo Lohrenz, Maximilian Strake, Tim Fingscheidt, Technische Universität Braunschweig, Germany
 
ASR-2.15: EXPLORING MODEL UNITS AND TRAINING STRATEGIES FOR END-TO-END SPEECH RECOGNITION
Mingkun Huang, YiZhou Lu, Shanghai Jiao Tong University, China; Lan Wang, Chinese Academy of Sciences, China; Yanmin Qian, Kai Yu, Shanghai Jiao Tong University, China
 
ASR-2.16: QUERY-BY-EXAMPLE ON-DEVICE KEYWORD SPOTTING
Byeonggeun Kim, Mingu Lee, Jinkyu Lee, Yeonseok Kim, Kyuwoong Hwang, Qualcomm, Korea (South)
 
ASR-2.17: SMALL-FOOTPRINT KEYWORD SPOTTING WITH GRAPH CONVOLUTIONAL NETWORK
Xi Chen, Shouyi Yin, Tsinghua University, China; Dandan Song, Peng Ouyang, TsingMicro Co. Ltd., China; Leibo Liu, Shaojun Wei, Tsinghua University, China
 
ASR-2.18: SIMPLIFIED LSTMS FOR SPEECH RECOGNITION
George Saon, Zoltan Tuske, Kartik Audhkhasi, Brian Kingsbury, Michael Picheny, Samuel Thomas, IBM, United States
 
ASR-2.19: GENERALIZED LARGE-CONTEXT LANGUAGE MODELS BASED ON FORWARD-BACKWARD HIERARCHICAL RECURRENT ENCODER-DECODER MODELS
Ryo Masumura, Mana Ihori, Tomohiro Tanaka, Itsumi Saito, Kyosuke Nishida, Takanobu Oba, NTT Corporation, Japan
 
ASR-2.20: END-TO-END TRAINING OF A LARGE VOCABULARY END-TO-END SPEECH RECOGNITION SYSTEM
Chanwoo Kim, Sungsoo Kim, Kwangyoun Kim, Mehul Kumar, Jiyeon Kim, Kyungmin Lee, Changwoo Han, Abhinav Garg, Eunhyang Kim, Minkyoo Shin, Shatrughan Singh, Larry Heck, Dhananjaya Gowda, Samsung Research, Korea (South)