Paper ID | D-1-2.4
Paper Title |
Learning Dense Correspondences via Local and Non-local Feature Fusion
Authors |
Wen-Chi Chin, National Tsing Hua University, Taiwan; Zih-Jian Jhang, Yan-Hao Huang, Industrial Technology Research Institute, Taiwan; Koichi Ito, Tohoku University, Japan; Hwann-Tzong Chen, National Tsing Hua University, Taiwan
Session |
D-1-2: Machine Learning Techniques for Image & Video
Time | Tuesday, 08 December, 15:30 - 17:00
Presentation Time | Tuesday, 08 December, 16:15 - 16:30
All times are in New Zealand Time (UTC +13) |
Topic |
Image, Video, and Multimedia (IVM)
Abstract |
We present a learning-based method for extracting distinctive features from video objects. From the extracted features, we derive dense correspondences between the objects in the current video frame and in the reference template. We train a deep-learning model with non-local blocks to predict dense feature maps that capture long-range dependencies. A new video object correspondence dataset is introduced for training and evaluation. Furthermore, we propose a new feature-aggregation technique based on the optical flow between consecutive frames, and apply it to integrate multiple feature maps and alleviate matching uncertainty. We also use the local information provided by optical flow to assess the reliability of feature matching. The experimental results show that our local and non-local fusion approach reduces unreliable correspondences and thus improves matching accuracy.
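The abstract does not give the details of the flow-based aggregation, but the general idea can be illustrated with a minimal NumPy sketch: backward-warp the previous frame's feature map using an optical-flow field, then blend the warped map with the current frame's features. All function names, the nearest-neighbor sampling, and the blending weight here are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

def warp_features(prev_feat, flow):
    """Backward-warp an (H, W, C) feature map from the previous frame
    into the current frame using an (H, W, 2) optical-flow field.
    Hypothetical sketch: nearest-neighbor sampling with border clamping."""
    H, W, _ = prev_feat.shape
    ys, xs = np.mgrid[0:H, 0:W]
    # flow[..., 0] is horizontal displacement, flow[..., 1] is vertical
    src_x = np.clip(np.rint(xs - flow[..., 0]).astype(int), 0, W - 1)
    src_y = np.clip(np.rint(ys - flow[..., 1]).astype(int), 0, H - 1)
    return prev_feat[src_y, src_x]

def aggregate(curr_feat, prev_feat, flow, alpha=0.5):
    """Fuse the flow-warped previous features with the current ones;
    alpha is an assumed fixed blending weight."""
    warped = warp_features(prev_feat, flow)
    return alpha * curr_feat + (1.0 - alpha) * warped

# Tiny usage example: a single active feature at (0, 0) shifted by a
# uniform flow of (+1, +1) should land at (1, 1) after warping.
prev = np.zeros((2, 2, 1))
prev[0, 0, 0] = 1.0
curr = np.zeros((2, 2, 1))
flow = np.ones((2, 2, 2))
fused = aggregate(curr, prev, flow, alpha=0.5)
```

Averaging several such warped maps is one simple way to smooth out per-frame uncertainty; a learned fusion (as the paper's non-local model suggests) would replace the fixed `alpha` with predicted weights.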