Asilomar 2025 || Pacific Grove, California || October 26

MA8b2: Speech, Image, and Video Analysis

Mon, 27 Oct, 10:15 - 11:55 PT (UTC -7)

Location: Merrill Hall

Session Type: Poster

Session Chair: Salman Asif, University of California, Riverside

Track: Speech, Image and Video Processing

MA8b2.1: High-Throughput and Small-Area Huffman Decoder for Baseline JPEG Using Small Memory Arrays

Yechengnuo Zhang, Derek Li, Bevan Baas, University of California, Davis, United States

MA8b2.2: History-Augmented Vision-Language Models for Frontier-Based Zero-Shot Object Navigation

Mobin Habibpour, Fatemeh Afghah, Clemson University, United States

MA8b2.3: Intonation: Barely Audible Sounds That Normies Hallucinate

Kevin Moreno, Eric Freudenthal, University of Texas at El Paso, United States

MA8b2.4: DSSCNet: A Transfer Learning Framework for Cross-Corpus Dysarthric Speech Severity Classification

Arnab Kumar Roy, Sikkim Manipal Institute of Technology, India; Hemant Kumar Kathania, Paban Sapkota, National Institute of Technology, Sikkim, India; Sudarsana Reddy Kadiri, Shrikanth Narayanan, University of Southern California, United States

MA8b2.5: Efficient No-Reference Video Quality Assessment Using Video Masked Autoencoder Feature Mixing

Suresh N, Sumohana S Channappayya, IIT Hyderabad, India

MA8b2.6: A Review of Connectionist Temporal Classification and Transducer Losses with the Gradients using Estimated Labels (GEL) Algorithm

Chanwoo Kim, Korea University, Republic of Korea

MA8b2.7: Triplane Learning for Event Stream Representation

Anustup Choudhury, Guan-Ming Su, Dolby Laboratories Inc., United States; Jingxi Chen, University of Maryland, College Park, United States

MA8b2.8: Memory-Efficient Keyword Spotting

Sanghyeon Ju, Sunggu Lee, Pohang University of Science and Technology, Republic of Korea

MA8b2.9: Prompt-Conditioned Vision-Language Models for Detecting Unseen Backdoor Images

Kyle Stein, University of West Florida, United States; Andrew Arash Mahyari, Institute for Human and Machine Cognition, United States; Guillermo Francia, Eman El-Sheikh, University of West Florida, United States

MA8b2.10: Leveraging Features from Self-Supervised Learning Models for Zero-Shot Children's Speech KWS

Subham Kutum, Abhijit Sinha, Hemant Kumar Kathania, National Institute of Technology, Sikkim, India; Sudarsana Reddy Kadiri, Signal Analysis and Interpretation Laboratory (SAIL), University of Southern California, Los Angeles, United States; Mahesh Chandra Govil, National Institute of Technology, Sikkim, India

MA8b2.11: Autoencoder Based Optimized SSL Representations: Complexity Minimization and Improved Dysarthric ASR

Paban Sapkota, Hemant Kumar Kathania, National Institute of Technology, Sikkim, India; Mikko Kurimo, Aalto University, Finland; Sudarsana Reddy Kadiri, Shrikanth Narayanan, University of Southern California, United States

MA8b2.12: FABLE: Florence-2–Assisted Behavioral Learning and Embedding for Multilabel Action Recognition

Jonathan McGee, Lisa S. Berlizova, Soumee Guha, Peter Youngs, Scott T. Acton, University of Virginia, United States