Asilomar 2024 || Pacific Grove, California || October 27

TP2b: Perceptual and Higher Level Loss and Distance Functions for Audio and Acoustics

Tue, 29 Oct, 15:30 - 17:35 PT (UTC -7)

Location: Acacia

Session Type: Lecture

Session Co-Chairs: Gerald Schuller, Ilmenau University of Technology and Muhammad Imran, Ilmenau University of Technology

Track: Speech, Image and Video Processing

Tue, 29 Oct, 15:30 - 15:55 PT (UTC -7)

TP2b.1: Similarity Metrics for Late Reverberation

Gloria Dal Santo, Karolina Prawda, Aalto University, Finland; Sebastian Jiro Schlecht, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Germany; Vesa Välimäki, Aalto University, Finland

Tue, 29 Oct, 15:55 - 16:20 PT (UTC -7)

TP2b.2: Time-Frequency Audio Similarity using Optimal Transport

Linda Fabiani, École Polytechnique Fédérale de Lausanne, Switzerland; Sebastian J. Schlecht, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany; Filip Elvander, Aalto University, Finland

Tue, 29 Oct, 16:20 - 16:45 PT (UTC -7)

TP2b.3: A Novel Perceptual Loss Function for Audio and Music Quality Differentiable in PyTorch

Gerald Schuller, Ilmenau University of Technology, Germany

Tue, 29 Oct, 16:45 - 17:10 PT (UTC -7)

TP2b.4: Instrumental Timbre Transfer Based on Disentangled Representation of Timbre and Pitch

Lin Ye, Gerald Schuller, Muhammad Imran, Ilmenau University of Technology, Germany

Tue, 29 Oct, 17:10 - 17:35 PT (UTC -7)

TP2b.5: Pruning-aware Loss Functions for STOI-Optimized Pruned Recurrent Autoencoders for the Compression of the Stimulation Patterns of Cochlear Implants at Zero Delay

Reemt Hinrichs, Jörn Ostermann, Leibniz University Hannover, Germany