Technical Program

Paper Detail

Paper ID C-3-2.6
Paper Title DETECTING OBJECT SURFACE KEYPOINTS FROM A SINGLE RGB IMAGE VIA DEEP LEARNING NETWORK FOR 6DOF POSE ESTIMATION
Authors Wen-Nung Lie, Lee Aing, National Chung Cheng University, Taiwan
Session C-3-2: Machine Learning and Data Analysis 2
Time Thursday, 10 December, 15:30 - 17:15
Presentation Time Thursday, 10 December, 16:45 - 17:00
All times are in New Zealand Time (UTC +13)
Topic Machine Learning and Data Analytics (MLDA)
Abstract Estimating the 6DoF object pose from a single RGB image is a challenging task in computer vision. Before the pose parameters can be estimated by a traditional PnP algorithm, the 2D image projections of a set of 3D object keypoints must be accurately detected. In this paper, we present techniques for designating 3D object keypoints and predicting their corresponding 2D counterparts via deep-learning network architectures. To designate the object keypoints, we first employ k-means clustering to calculate object surface weights, and then select from all surface points those that are widely distributed and carry larger surface weights, so as to describe the object shape as well as possible. Moreover, a robust loss function that focuses on small-scale errors is adopted in training a ResNet18 network to predict the image projections of the object keypoints. Experimental results show that our proposed technique outperforms state-of-the-art approaches in correctness ratio under both the “2D projection” and “3D transformation” metrics.
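The keypoint-designation step described in the abstract — cluster the object's surface points with k-means, then favor widely distributed points with larger surface weights — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `select_keypoints`, the toy data, and the rule "pick the highest-weight point per cluster" are all assumptions made for the example; how the paper actually computes and combines surface weights is not specified in this listing.

```python
import numpy as np

def select_keypoints(points, weights, k=8, iters=20, seed=0):
    """Hypothetical sketch: run k-means over 3D surface points so the
    chosen keypoints are spread across the object, then keep the point
    with the largest surface weight inside each cluster."""
    rng = np.random.default_rng(seed)
    # initialize cluster centers from k distinct surface points
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(iters):
        # assign each point to its nearest center
        d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # update each center; keep the old one if its cluster is empty
        for j in range(k):
            mask = labels == j
            if mask.any():
                centers[j] = points[mask].mean(axis=0)
    # within each non-empty cluster, keep the highest-weight point
    keypoints = []
    for j in range(k):
        idx = np.where(labels == j)[0]
        if len(idx):
            keypoints.append(points[idx[weights[idx].argmax()]])
    return np.array(keypoints)

# toy example: 200 random "surface" points with random weights
pts = np.random.default_rng(1).random((200, 3))
w = np.random.default_rng(2).random(200)
kps = select_keypoints(pts, w, k=8)
print(kps.shape)
```

The resulting 3D keypoints would then be projected by the network into 2D, and the 2D–3D correspondences fed to a PnP solver to recover the 6DoF pose, as the abstract describes.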