AUD-P2: Deep Learning for Speech and Audio
Session Type: Poster
Time: Tuesday, 5 May, 11:30 - 13:30
Location: On-Demand
Session Chair: Shoko Araki, NTT
AUD-P2.1: WAWENETS: A NO-REFERENCE CONVOLUTIONAL WAVEFORM-BASED APPROACH TO ESTIMATING NARROWBAND AND WIDEBAND SPEECH QUALITY
Andrew Catellier; Institute for Telecommunication Sciences
Stephen Voran; Institute for Telecommunication Sciences
AUD-P2.2: A NEURAL NETWORK FOR MONAURAL INTRUSIVE SPEECH INTELLIGIBILITY PREDICTION
Mathias Bach Pedersen; Aalborg University
Asger Heidemann Andersen; Oticon A/S
Søren Holdt Jensen; Aalborg University
Jesper Jensen; Aalborg University
AUD-P2.3: SOURCE CODING OF AUDIO SIGNALS WITH A GENERATIVE MODEL
Roy Fejgin; Dolby Laboratories
Janusz Klejsa; Dolby Sweden AB
Lars Villemoes; Dolby Sweden AB
Cong Zhou; Dolby Laboratories
AUD-P2.4: FULL-REFERENCE SPEECH QUALITY ESTIMATION WITH ATTENTIONAL SIAMESE NEURAL NETWORKS
Gabriel Mittag; Technische Universität Berlin
Sebastian Möller; Technische Universität Berlin
AUD-P2.5: ENHANCED METHOD OF AUDIO CODING USING CNN-BASED SPECTRAL RECOVERY WITH ADAPTIVE STRUCTURE
Seong-Hyeon Shin; Kwangwoon University
Seung Kwon Beack; Electronics and Telecommunications Research Institute (ETRI)
Wootaek Lim; Electronics and Telecommunications Research Institute (ETRI)
Hochong Park; Kwangwoon University
AUD-P2.6: AUDIO CODEC ENHANCEMENT WITH GENERATIVE ADVERSARIAL NETWORKS
Arijit Biswas; Dolby Germany GmbH
Dai Jia; Dolby Laboratories
AUD-P2.7: EFFICIENT AND SCALABLE NEURAL RESIDUAL WAVEFORM CODING WITH COLLABORATIVE QUANTIZATION
Kai Zhen; Indiana University
Mi Suk Lee; Electronics and Telecommunications Research Institute (ETRI)
Jongmo Sung; Electronics and Telecommunications Research Institute (ETRI)
Seungkwon Beack; Electronics and Telecommunications Research Institute (ETRI)
Minje Kim; Indiana University
AUD-P2.8: A DUAL-STAGED CONTEXT AGGREGATION METHOD TOWARDS EFFICIENT END-TO-END SPEECH ENHANCEMENT
Kai Zhen; Indiana University
Mi Suk Lee; Electronics and Telecommunications Research Institute (ETRI)
Minje Kim; Indiana University
AUD-P2.9: A RECURRENT VARIATIONAL AUTOENCODER FOR SPEECH ENHANCEMENT
Simon Leglaive; CentraleSupélec, IETR
Xavier Alameda-Pineda; Inria Grenoble Rhône-Alpes
Laurent Girin; Univ. Grenoble Alpes, Grenoble INP, GIPSA-lab
Radu Horaud; Inria Grenoble Rhône-Alpes
AUD-P2.10: SPEAKERFILTER: DEEP LEARNING-BASED TARGET SPEAKER EXTRACTION USING ANCHOR SPEECH
ShuLin He; Inner Mongolia University
Hao Li; Inner Mongolia University
XueLiang Zhang; Inner Mongolia University
AUD-P2.11: TACKLING REAL NOISY REVERBERANT MEETINGS WITH ALL-NEURAL SOURCE SEPARATION, COUNTING, AND DIARIZATION SYSTEM
Keisuke Kinoshita; NTT Corporation
Marc Delcroix; NTT Corporation
Shoko Araki; NTT Corporation
Tomohiro Nakatani; NTT Corporation
AUD-P2.12: TIME-DOMAIN AUDIO SOURCE SEPARATION BASED ON WAVE-U-NET COMBINED WITH DISCRETE WAVELET TRANSFORM
Tomohiko Nakamura; University of Tokyo
Hiroshi Saruwatari; University of Tokyo