Technical Program

Paper Detail

Paper IDA-2-3.5
Paper Title A PARALLELIZATION METHOD OF INCEPTION ARCHITECTURE BASED ON ARRAY PROCESSOR
Authors Xiaoyan Xie, Zhuolin Du, Chuanzhan Hu, Kun Yang, Anqi Wang, Xi’an University of Posts and Telecommunications, China
Session A-2-3: Reconfigurable Computing and Performance Evaluation
Time Wednesday, 09 December, 17:15 - 19:15
Presentation Time: Wednesday, 09 December, 18:15 - 18:30
All times are in New Zealand Time (UTC +13)
Topic Signal Processing Systems: Design and Implementation (SPS): Special Session: Reconfigurable Computing and Performance Evaluation
Abstract The Inception architecture proposed for GoogLeNet has few parameters, strong expressive ability, and a low degree of overfitting, which makes it possible to deploy convolutional neural networks (CNNs) on mobile or embedded terminals with limited resources. To support parallel, reconfigurable processing of 28 × 28 and 32 × 32 image recognition with 1 × 1, 3 × 3, and 5 × 5 convolution kernels on a 4 × 4 processing element (PE) array, this paper converts the input image into dimensional arrays, reducing the frequency of hardware accesses during the convolution calculation. By analyzing the data dependencies among the convolution and pooling operations in the network, an overlapping-window data reuse scheme is proposed, which reduces the number of pixels loaded from external memory by 30%. The proposed method was verified with the MNIST and CIFAR-10 functional test data sets on the array processor (DPR-CODEC) platform developed by the project team. The experimental results show that, at an operating frequency of 123 MHz, compared with the scheme without preprocessing, the preprocessing hardware access overhead is reduced to 45%, the data reuse rate of the convolution calculation reaches 66.7%, the operating power consumption is 6.395 W, the power per watt is 0.176, and the performance is significantly improved compared to the FPGA version.
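The overlapping-window idea in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation. It assumes a stride of 1, a window that slides horizontally, and reuse only between horizontally adjacent windows; the function names are hypothetical. Under those assumptions a 3 × 3 kernel shares 6 of its 9 pixels with the previous window, which is one reading consistent with the 66.7% reuse rate quoted above, although the abstract does not state how that figure was derived.

```python
def window_reuse_rate(kernel: int, stride: int = 1) -> float:
    """Fraction of a kernel x kernel window that was already loaded by the
    previous window when sliding horizontally by `stride` pixels."""
    overlap_cols = max(kernel - stride, 0)  # columns shared with the previous window
    return (overlap_cols * kernel) / (kernel * kernel)


def pixel_loads(size: int, kernel: int, stride: int = 1, reuse: bool = True) -> int:
    """Count pixel loads for a valid (no padding) convolution over a
    size x size image, with or without horizontal window reuse."""
    positions = (size - kernel) // stride + 1  # window positions per row and per column
    full_window = kernel * kernel
    if not reuse:
        # Every window position reloads its whole window.
        return positions * positions * full_window
    # With reuse: the first window in a row is loaded in full, and each later
    # window loads only the columns that were not in the previous window.
    new_per_step = min(stride, kernel) * kernel
    per_row = full_window + (positions - 1) * new_per_step
    return positions * per_row


if __name__ == "__main__":
    print(f"{window_reuse_rate(3):.1%}")  # 66.7%
    print(pixel_loads(28, 3, reuse=False), pixel_loads(28, 3, reuse=True))
```

The counts are illustrative only: they model horizontal reuse on a single feature map and do not attempt to reproduce the 30% external-memory reduction or the 45% overhead figure reported for the DPR-CODEC platform.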