Technical Program

Paper Detail

Paper ID	E-2-1.2
Paper Title	SPECTRAL FEATURES AND PITCH HISTOGRAM FOR AUTOMATIC SINGING QUALITY EVALUATION WITH CRNN
Authors	Lin Huang, Chitralekha Gupta, Haizhou Li, National University of Singapore, Singapore
Session	E-2-1: Music Information Processing 2, Voice Conversion
Time	Wednesday, 09 December, 12:30 - 14:00
Presentation Time	Wednesday, 09 December, 12:45 - 13:00
	All times are in New Zealand Time (UTC +13)
Topic	Speech, Language, and Audio (SLA)
Abstract	Deep neural networks (DNNs) have been applied successfully to music information retrieval (MIR). In this paper, we design a convolutional recurrent neural network (CRNN) for automatic singing quality evaluation, and present a comparative study of various acoustic features as the network input. We optimize the CRNN so that the machine-predicted scores move closer to the human-annotated scores. Furthermore, we augment the spectral features with a pitch histogram, a musically motivated representation, as additional network input. The experiments show that the proposed CRNN framework can effectively learn the underlying properties that distinguish singing quality. Moreover, explicitly incorporating the pitch histogram further improves system performance and reduces the system’s dependency on song content.
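
The abstract does not say how the pitch histogram is computed. As a rough illustration only, the sketch below shows one common construction: an octave-folded, normalised histogram of voiced pitch values measured in cents. The function name `pitch_histogram`, the 120-bin resolution, the 440 Hz reference, and the convention that unvoiced frames carry a value of 0 are all assumptions made for this example, not details taken from the paper.

```python
import math


def pitch_histogram(f0_hz, bins_per_octave=120, ref_hz=440.0):
    """Octave-folded, normalised histogram of voiced pitch values.

    f0_hz: per-frame fundamental frequency estimates in Hz.
    Assumption: unvoiced frames are marked with a value <= 0.
    """
    hist = [0.0] * bins_per_octave
    voiced = 0
    for f in f0_hz:
        if f <= 0:
            continue  # skip unvoiced frames
        # Distance from the reference pitch in cents (1200 cents per octave).
        cents = 1200.0 * math.log2(f / ref_hz)
        # Fold all octaves onto a single one so only pitch class matters.
        folded = cents % 1200.0
        # The trailing modulo guards against rounding up to the last edge.
        idx = int(folded / 1200.0 * bins_per_octave) % bins_per_octave
        hist[idx] += 1.0
        voiced += 1
    if voiced == 0:
        return hist  # no voiced frames: return the all-zero histogram
    # Normalise so the histogram does not depend on recording length.
    return [h / voiced for h in hist]
```

Because the histogram pools pitch values over the whole recording and discards their order, it summarises how tightly the sung pitches cluster rather than which melody was sung, which is one plausible reading of why such a feature could reduce dependency on song content.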