MA8b2.2

History-Augmented Vision-Language Models for Frontier-Based Zero-Shot Object Navigation

Mobin Habibpour, Fatemeh Afghah, Clemson University, United States

Session:
MA8b2: Speech, Image, and Video Analysis Poster

Track:
Speech, Image and Video Processing

Location:
Merrill Hall

Presentation Time:
Mon, 27 Oct, 10:15 - 11:55 PT (UTC -7)

Session MA8b2
MA8b2.1: High-Throughput and Small-Area Huffman Decoder for Baseline JPEG Using Small Memory Arrays
Yechengnuo Zhang, Derek Li, Bevan Baas, University of California, Davis, United States
MA8b2.2: History-Augmented Vision-Language Models for Frontier-Based Zero-Shot Object Navigation
Mobin Habibpour, Fatemeh Afghah, Clemson University, United States
MA8b2.3: Intonation: Inaudible Sounds That Normies Hallucinate
Kevin Moreno, Eric Freudenthal, University of Texas at El Paso, United States
MA8b2.4: DSSCNet: A Transfer Learning Framework for Cross-Corpus Dysarthric Speech Severity Classification
Arnab Kumar Roy, Sikkim Manipal Institute of Technology, India; Hemant Kumar Kathania, Paban Sapkota, National Institute of Technology, Sikkim, India; Sudarsana Reddy Kadiri, Shrikanth Narayanan, University of Southern California, United States
MA8b2.5: Efficient No-Reference Video Quality Assessment Using Video Masked Autoencoder Feature Mixing
Suresh N, Sumohana S Channappayya, IIT Hyderabad, India
MA8b2.6: A Review of Connectionist Temporal Classification and Transducer Losses with the Gradients using Estimated Labels (GEL) Algorithm
Chanwoo Kim, Korea University, Republic of Korea
MA8b2.7: Triplane Learning for Event Stream Representation
Anustup Choudhury, Guan-Ming Su, Dolby Laboratories Inc., United States; Jingxi Chen, University of Maryland, College Park, United States
MA8b2.8: Memory-Efficient Keyword Spotting
Sanghyeon Ju, Sunggu Lee, Pohang University of Science and Technology, Republic of Korea
MA8b2.9: Prompt Conditioned Vision-Language Models for Detecting Novel Backdoor Images
Kyle Stein, University of West Florida, United States; Andrew Arash Mahyari, Institute for Human and Machine Cognition, United States; Guillermo Francia, Eman El-Sheikh, University of West Florida, United States
MA8b2.10: Leveraging Features from Self-Supervised Learning Models for Zero-Shot Children's Speech KWS
Subham Kutum, Abhijit Sinha, Hemant Kumar Kathania, National Institute of Technology, Sikkim, India; Sudarsana Reddy Kadiri, Signal Analysis and Interpretation Laboratory (SAIL), University of Southern California, Los Angeles, United States; Mahesh Chandra Govil, National Institute of Technology, Sikkim, India
MA8b2.11: Autoencoder Based Optimized SSL Representations: Complexity Minimization and Improved Dysarthric ASR
Paban Sapkota, Hemant Kumar Kathania, National Institute of Technology, Sikkim, India; Mikko Kurimo, Aalto University, Finland; Sudarsana Reddy Kadiri, Shrikanth Narayanan, University of Southern California, United States
MA8b2.12: FABLE: Florence-guided Annotation of Bounding Boxes for Learning Videos
Jonathan McGee, Lisa S. Berlizova, Soumee Guha, Peter Youngs, Scott T. Acton, University of Virginia, United States