Paper ID | MMSP-L1.5 |
Paper Title | TRILINGUAL SEMANTIC EMBEDDINGS OF VISUALLY GROUNDED SPEECH WITH SELF-ATTENTION MECHANISMS |
Authors | Yasunori Ohishi, Akisato Kimura, Takahito Kawanishi, Kunio Kashino, NTT Corporation, Japan; David Harwath, James Glass, Massachusetts Institute of Technology, United States |
Session | MMSP-L1: Signal Processing for Multimedia Applications II |
Location | On-Demand |
Session Time | Friday, 08 May, 08:00 - 10:00 |
Presentation Time | Friday, 08 May, 09:20 - 09:40 |
Presentation | Lecture |
Topic | Multimedia Signal Processing: Signal Processing for Multimedia Applications |