Technical Program

MLSP-P2: Applications in Speech and Audio

Session Type: Poster
Time: Tuesday, 5 May, 11:30 - 13:30
Location: On-Demand
Session Chair: Ritwik Giri, Amazon Web Services (AWS)
 
 MLSP-P2.1: TOWARDS BLIND QUALITY ASSESSMENT OF CONCERT AUDIO RECORDINGS USING DEEP NEURAL NETWORKS
         Nikonas Simou; University of Crete
         Yannis Mastorakis; Foundation for Research and Technology-Hellas (FORTH)
         Nikolaos Stefanakis; Foundation for Research and Technology-Hellas (FORTH)
 
 MLSP-P2.3: MULTI-LABEL SOUND EVENT RETRIEVAL USING A DEEP LEARNING-BASED SIAMESE STRUCTURE WITH A PAIRWISE PRESENCE MATRIX
         Jianyu Fan; Simon Fraser University
         Eric Nichols; Microsoft
         Daniel Tompkins; Microsoft
         Ana Elisa Méndez Méndez; New York University
         Benjamin Elizalde; Carnegie Mellon University
         Philippe Pasquier; Simon Fraser University
 
 MLSP-P2.4: SPEECH-DRIVEN FACIAL ANIMATION USING POLYNOMIAL FUSION OF FEATURES
         Triantafyllos Kefalas; Imperial College London
         Konstantinos Vougioukas; Imperial College London
         Yannis Panagakis; Imperial College London
         Stavros Petridis; Imperial College London and Samsung AI Centre Cambridge
         Jean Kossaifi; Imperial College London and Samsung AI Centre Cambridge
         Maja Pantic; Imperial College London and Samsung AI Centre Cambridge
 
 MLSP-P2.5: SED-MDD: TOWARDS SENTENCE DEPENDENT END-TO-END MISPRONUNCIATION DETECTION AND DIAGNOSIS
         Yiqing Feng; Harbin Institute of Technology
         Guanyu Fu; Harbin Institute of Technology
         Qingcai Chen; Harbin Institute of Technology
         Kai Chen; Harbin Institute of Technology
 
 MLSP-P2.6: GENERATIVE PRE-TRAINING FOR SPEECH WITH AUTOREGRESSIVE PREDICTIVE CODING
         Yu-An Chung; Massachusetts Institute of Technology
         James Glass; Massachusetts Institute of Technology
 
 MLSP-P2.7: STARGAN FOR EMOTIONAL SPEECH CONVERSION: VALIDATED BY DATA AUGMENTATION OF END-TO-END EMOTION RECOGNITION
         Georgios Rizos; Imperial College London
         Alice Baird; University of Augsburg
         Max Elliott; Imperial College London
         Björn Schuller; Imperial College London
 
 MLSP-P2.8: MULTIMODAL TRANSFORMER FUSION FOR CONTINUOUS EMOTION RECOGNITION
         Jian Huang; Institute of Automation, Chinese Academy of Sciences
         Jianhua Tao; Institute of Automation, Chinese Academy of Sciences
         Bin Liu; Institute of Automation, Chinese Academy of Sciences
         Zheng Lian; Institute of Automation, Chinese Academy of Sciences
         Mingyue Niu; Institute of Automation, Chinese Academy of Sciences
 
 MLSP-P2.9: HKA: A HIERARCHICAL KNOWLEDGE ATTENTION MECHANISM FOR MULTI-TURN DIALOGUE SYSTEM
         Jian Song; Tsinghua University
         Kailai Zhang; Tsinghua University
         Xuesi Zhou; Tsinghua University
         Ji Wu; Tsinghua University
 
 MLSP-P2.10: SUBMODULAR RANK AGGREGATION ON SCORE-BASED PERMUTATIONS FOR DISTRIBUTED AUTOMATIC SPEECH RECOGNITION
         Jun Qi; Georgia Institute of Technology
         Chao-Han Huck Yang; Georgia Institute of Technology
         Javier Tejedor; Universidad San Pablo-CEU, CEU Universities
 
 MLSP-P2.11: BRIDGING MIXTURE DENSITY NETWORKS WITH META-LEARNING FOR AUTOMATIC SPEAKER IDENTIFICATION
         Ruirui Li; University of California, Los Angeles
         Jyun-Yu Jiang; University of California, Los Angeles
         Xian Wu; University of Notre Dame
         Hongda Mao; Amazon, Inc.
         Chu-Cheng Hsieh; Amazon, Inc.
         Wei Wang; University of California, Los Angeles
 
 MLSP-P2.12: PITCH ESTIMATION VIA SELF-SUPERVISION
         Beat Gfeller; Google
         Christian Frank; Google
         Dominik Roblek; Google
         Matt Sharifi; Google
         Marco Tagliasacchi; Google
         Mihajlo Velimirovic; Google