Paper Detail

Paper ID: D-2-2.3
Paper Title: SCENE TEXT-LINE EXTRACTION WITH FULLY CONVOLUTIONAL NETWORK AND REFINED PROPOSALS
Authors: Guan-Xin Zeng, Yu-Hong Hou, Po-Chyi Su, National Central University, Taiwan; Li-Wei Kang, National Taiwan Normal University, Taiwan
Session: D-2-2, Recent Advances in Deep Learning with Multimedia Applications
Time: Wednesday, 09 December, 15:30 - 17:00
Presentation Time: Wednesday, 09 December, 16:00 - 16:15
All times are in New Zealand Time (UTC +13)
Topic: Image, Video, and Multimedia (IVM): Special Session: Recent Advances in Deep Learning with Multimedia Applications
Abstract: Text appearing in images often constitutes regions of interest, and locating such areas for further analysis can help extract image-related information and facilitate many applications. Pixel-based segmentation and region-based object classification are two methodologies for identifying text areas in images, each with its own pros and cons. In this research, a text detection scheme consisting of a pixel-based classification network and a supplementary region proposal network is proposed. The main network is a Fully Convolutional Network (FCN) that employs Feature Pyramid Networks (FPN) and Atrous Spatial Pyramid Pooling (ASPP) to indicate possible text areas and text borders with high recall. Certain areas are further processed by the refinement network, i.e., a simplified Connectionist Text Proposal Network (CTPN), with high precision. Non-Maximum Suppression (NMS) is then applied to form appropriate text-lines. The experimental results show the feasibility of the scheme.
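
For readers who want a concrete picture of the pipeline, the following is a minimal PyTorch sketch (not the authors' implementation) of the components described in the abstract: an FCN-style main network combining FPN-like top-down fusion with an ASPP block to predict text and border score maps, followed by Non-Maximum Suppression over refined proposals before text-line linking. All module names, layer sizes, and dilation rates are illustrative assumptions, and the CTPN refinement stage is represented only by hypothetical proposal boxes.

import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.ops import nms


class ASPP(nn.Module):
    """Atrous Spatial Pyramid Pooling: parallel dilated 3x3 convolutions."""
    def __init__(self, in_ch, out_ch, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r) for r in rates
        )
        self.project = nn.Conv2d(out_ch * len(rates), out_ch, 1)

    def forward(self, x):
        return self.project(torch.cat([F.relu(b(x)) for b in self.branches], dim=1))


class TextFCN(nn.Module):
    """Illustrative FCN main network: FPN-style top-down fusion + ASPP head.

    Outputs two pixel score maps per image (text region and text border)
    at 1/4 of the input resolution.
    """
    def __init__(self):
        super().__init__()
        # Tiny stand-in backbone producing features at strides 4, 8, 16.
        self.c2 = nn.Sequential(nn.Conv2d(3, 64, 3, stride=4, padding=1), nn.ReLU())
        self.c3 = nn.Sequential(nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU())
        self.c4 = nn.Sequential(nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.ReLU())
        # FPN lateral 1x1 convolutions to a common channel width.
        self.l2 = nn.Conv2d(64, 128, 1)
        self.l3 = nn.Conv2d(128, 128, 1)
        self.l4 = nn.Conv2d(256, 128, 1)
        self.aspp = ASPP(128, 128)
        self.head = nn.Conv2d(128, 2, 1)  # channel 0: text map, channel 1: border map

    def forward(self, x):
        c2 = self.c2(x)
        c3 = self.c3(c2)
        c4 = self.c4(c3)
        # Top-down pathway: upsample coarser levels and add lateral features.
        p4 = self.l4(c4)
        p3 = self.l3(c3) + F.interpolate(p4, size=c3.shape[-2:], mode="nearest")
        p2 = self.l2(c2) + F.interpolate(p3, size=c2.shape[-2:], mode="nearest")
        return torch.sigmoid(self.head(self.aspp(p2)))


def merge_proposals(boxes, scores, iou_thr=0.3):
    """Standard NMS over refined proposals; the kept boxes would then be
    linked horizontally to form text-lines."""
    keep = nms(boxes, scores, iou_thr)
    return boxes[keep], scores[keep]


if __name__ == "__main__":
    net = TextFCN()
    maps = net(torch.randn(1, 3, 256, 256))   # -> (1, 2, 64, 64) score maps
    # Hypothetical refined proposals from a CTPN-like stage: (x1, y1, x2, y2).
    boxes = torch.tensor([[10.0, 20.0, 60.0, 40.0], [12.0, 22.0, 58.0, 38.0]])
    scores = torch.tensor([0.9, 0.7])
    print(maps.shape, merge_proposals(boxes, scores)[0].shape)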