Technical Program

Manuscripts are available on IEEE Xplore via the IEEE ICASSP 2020 Open Preview.

AUD-P2: Deep Learning for Speech and Audio

Session Type: Poster
Time: Tuesday, 5 May, 11:30 - 13:30
Location: On-Demand
Session Chair: Shoko Araki, NTT
 
 AUD-P2.1: WAWENETS: A NO-REFERENCE CONVOLUTIONAL WAVEFORM-BASED APPROACH TO ESTIMATING NARROWBAND AND WIDEBAND SPEECH QUALITY
         Andrew Catellier; Institute for Telecommunication Sciences
         Stephen Voran; Institute for Telecommunication Sciences
 
 AUD-P2.2: A NEURAL NETWORK FOR MONAURAL INTRUSIVE SPEECH INTELLIGIBILITY PREDICTION
         Mathias Bach Pedersen; Aalborg University
         Asger Heidemann Andersen; Oticon A/S
         Søren Holdt Jensen; Aalborg University
         Jesper Jensen; Aalborg University
 
 AUD-P2.3: SOURCE CODING OF AUDIO SIGNALS WITH A GENERATIVE MODEL
         Roy Fejgin; Dolby Laboratories
         Janusz Klejsa; Dolby Sweden AB
         Lars Villemoes; Dolby Sweden AB
         Cong Zhou; Dolby Laboratories
 
 AUD-P2.4: FULL-REFERENCE SPEECH QUALITY ESTIMATION WITH ATTENTIONAL SIAMESE NEURAL NETWORKS
         Gabriel Mittag; Technische Universität Berlin
         Sebastian Möller; Technische Universität Berlin
 
 AUD-P2.5: ENHANCED METHOD OF AUDIO CODING USING CNN-BASED SPECTRAL RECOVERY WITH ADAPTIVE STRUCTURE
         Seong-Hyeon Shin; Kwangwoon University
         Seung Kwon Beack; Electronics and Telecommunications Research Institute (ETRI)
         Wootaek Lim; Electronics and Telecommunications Research Institute (ETRI)
         Hochong Park; Kwangwoon University
 
 AUD-P2.6: AUDIO CODEC ENHANCEMENT WITH GENERATIVE ADVERSARIAL NETWORKS
         Arijit Biswas; Dolby Germany GmbH
         Dai Jia; Dolby Laboratories
 
 AUD-P2.7: EFFICIENT AND SCALABLE NEURAL RESIDUAL WAVEFORM CODING WITH COLLABORATIVE QUANTIZATION
         Kai Zhen; Indiana University
         Mi Suk Lee; Electronics and Telecommunications Research Institute (ETRI)
         Jongmo Sung; Electronics and Telecommunications Research Institute (ETRI)
         Seungkwon Beack; Electronics and Telecommunications Research Institute (ETRI)
         Minje Kim; Indiana University
 
 AUD-P2.8: A DUAL-STAGED CONTEXT AGGREGATION METHOD TOWARDS EFFICIENT END-TO-END SPEECH ENHANCEMENT
         Kai Zhen; Indiana University
         Mi Suk Lee; Electronics and Telecommunications Research Institute (ETRI)
         Minje Kim; Indiana University
 
 AUD-P2.9: A RECURRENT VARIATIONAL AUTOENCODER FOR SPEECH ENHANCEMENT
         Simon Leglaive; CentraleSupélec, IETR
         Xavier Alameda-Pineda; Inria Grenoble Rhone-Alpes
         Laurent Girin; Univ. Grenoble Alpes, Grenoble INP, GIPSA-lab
         Radu Horaud; Inria Grenoble Rhone-Alpes
 
 AUD-P2.10: SPEAKERFILTER: DEEP LEARNING-BASED TARGET SPEAKER EXTRACTION USING ANCHOR SPEECH
         ShuLin He; Inner Mongolia University
         Hao Li; Inner Mongolia University
         XueLiang Zhang; Inner Mongolia University
 
 AUD-P2.11: TACKLING REAL NOISY REVERBERANT MEETINGS WITH ALL-NEURAL SOURCE SEPARATION, COUNTING, AND DIARIZATION SYSTEM
         Keisuke Kinoshita; NTT Corporation
         Marc Delcroix; NTT Corporation
         Shoko Araki; NTT Corporation
         Tomohiro Nakatani; NTT Corporation
 
 AUD-P2.12: TIME-DOMAIN AUDIO SOURCE SEPARATION BASED ON WAVE-U-NET COMBINED WITH DISCRETE WAVELET TRANSFORM
         Tomohiko Nakamura; University of Tokyo
         Hiroshi Saruwatari; University of Tokyo