| Paper ID | MMSP-L1.5 |
| Paper Title | TRILINGUAL SEMANTIC EMBEDDINGS OF VISUALLY GROUNDED SPEECH WITH SELF-ATTENTION MECHANISMS |
| Authors | Yasunori Ohishi, Akisato Kimura, Takahito Kawanishi, Kunio Kashino, NTT Corporation, Japan; David Harwath, James Glass, Massachusetts Institute of Technology, United States |
| Session | MMSP-L1: Signal Processing for Multimedia Applications II |
| Location | On-Demand |
| Session Time | Friday, 08 May, 08:00 - 10:00 |
| Presentation Time | Friday, 08 May, 09:20 - 09:40 |
| Presentation | Lecture |
| Topic | Multimedia Signal Processing: Signal Processing for Multimedia Applications |