MP8b.11

Autoencoder based optimized SSL representations: Complexity Minimization and improved Dysarthric ASR

Paban Sapkota, Hemant Kumar Kathania, National Institute of Technology Sikkim, India; Mikko Kurimo, Aalto University, Finland; Sudarsana Reddy Kadiri, Shrikanth Narayanan, University of Southern California, United States

Session:
MP8b: Speech, Image, and Video Analysis Poster

Track:
Speech, Image and Video Processing

Location:
Merrill Hall

Presentation Time:
Mon, 27 Oct, 10:15 - 11:55 PT (UTC -7)

Session MP8b
MP8b.1: High-Throughput and Small-Area Huffman Decoder for Baseline JPEG Using Small Memory Arrays
Yechengnuo Zhang, Derek Li, Bevan Baas, University of California, Davis, United States
MP8b.2: History-Augmented Vision-Language Models for Frontier-Based Zero-Shot Object Navigation
Mobin Habibpour, Fatemeh Afghah, Clemson University, United States
MP8b.3: Intonation: inaudible sounds that normies hallucinate
Kevin Moreno, Eric Freudenthal, University of Texas at El Paso, United States
MP8b.4: DSSCNet: A Transfer Learning Framework for Cross-Corpus Dysarthric Speech Severity Classification
Arnab Kumar Roy, Sikkim Manipal Institute of Technology, India; Hemant Kumar Kathania, Paban Sapkota, National Institute of Technology Sikkim, India; Sudarsana Reddy Kadiri, Shrikanth Narayanan, University of Southern California, United States
MP8b.5: Efficient No-Reference Video Quality Assessment Using Video Masked Autoencoder Feature Mixing
Suresh N, Sumohana S Channappayya, IIT Hyderabad, India
MP8b.6: A Review of Connectionist Temporal Classification and Transducer Losses with the Gradients Using Estimated Labels (GEL) Algorithm
Chanwoo Kim, Korea University, Republic of Korea
MP8b.7: Triplane Learning for Event Stream Representation
Anustup Choudhury, Guan-Ming Su, Dolby Laboratories Inc., United States; Jingxi Chen, University of Maryland, College Park, United States
MP8b.8: Memory-Efficient Keyword Spotting
Sanghyeon Ju, Pohang University of Science and Technology, Republic of Korea
MP8b.9: Prompt Conditioned Vision-Language Models for Detecting Novel Backdoor Images
Kyle Stein, The University of West Florida, United States; Andrew Arash Mahyari, Institute for Human and Machine Cognition, United States; Guillermo Francia, Eman El-Sheikh, University of West Florida, United States
MP8b.10: Leveraging Features from Self-Supervised Learning Models for Zero-Shot Children's Speech KWS
Subham Kutum, Abhijit Sinha, Hemant Kumar Kathania, National Institute of Technology, Sikkim, India; Sudarsana Reddy Kadiri, Signal Analysis and Interpretation Laboratory (SAIL), University of Southern California, Los Angeles, United States; Mahesh Chandra Govil, National Institute of Technology, Sikkim, India
MP8b.11: Autoencoder based optimized SSL representations: Complexity Minimization and improved Dysarthric ASR
Paban Sapkota, Hemant Kumar Kathania, National Institute of Technology Sikkim, India; Mikko Kurimo, Aalto University, Finland; Sudarsana Reddy Kadiri, Shrikanth Narayanan, University of Southern California, United States
MP8b.12: FABLE: Florence-Guided Annotation of Bounding Boxes for LEarning Videos
Jonathan McGee, Lisa S. Berlizova, Soumee Guha, Peter Youngs, Scott T. Acton, University of Virginia, United States