WP-L.B.2

SYNCHRONIZED AUDIO-VISUAL FRAMES WITH FRACTIONAL POSITIONAL ENCODING FOR TRANSFORMERS IN VIDEO-TO-TEXT TRANSLATION

Philipp Harzig, Moritz Einfalt, Rainer Lienhart, University of Augsburg, Germany

Session:
Machine Learning for 3D Processing
Lecture

Track:
Applications of Machine Learning

Location:
Room B

Presentation Time:
Wed, 19 Oct, 22:45 - 23:00 China Standard Time (UTC +8)
Wed, 19 Oct, 16:45 - 17:00 Central European Summer Time (UTC +2)
Wed, 19 Oct, 14:45 - 15:00 UTC
Wed, 19 Oct, 10:45 - 11:00 Eastern Daylight Time (UTC -4)

Session Co-Chairs:
Yoshinari Kameda, University of Tsukuba and Changjae Oh, Queen Mary University of London