MO4.R2.1

Conditional Mutual Information Constrained Deep Learning: Framework and Preliminary Results

En-Hui Yang, Shayan Mohajer Hamidi, Linfeng Ye, Renhao Tan, University of Waterloo, Canada; Beverly Yang, University of British Columbia, Canada

Session:
Topics in Machine Learning 2

Track:
8: Machine Learning

Location:
Ypsilon I-II-III

Presentation Time:
Mon, 8 Jul, 16:25 - 16:45

Session Chair:
Lalitha Sankar, Arizona State University

Abstract
In this paper, we introduce the notions of conditional mutual information (CMI) and normalized conditional mutual information (NCMI) for classification deep neural networks (DNNs). In particular, CMI and the ratio between CMI and NCMI quantify the intra-class concentration and inter-class separation of a DNN in its output probability distribution space, respectively. Using NCMI to assess widely recognized DNNs pre-trained on ImageNet reveals a notable inverse relationship between their validation accuracies and their NCMI values on the ImageNet validation dataset. Building upon this insight, the conventional deep learning (DL) framework is modified by minimizing the standard cross-entropy function while imposing an NCMI constraint. This refinement results in a novel approach referred to as CMI-constrained deep learning (CMIC-DL). Comprehensive experimental findings demonstrate that DNNs trained within CMIC-DL outperform state-of-the-art models trained within the standard DL framework and with other loss functions from the literature. In addition, some semantic meaning of CMI is also revealed.
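The abstract describes CMI as a measure of intra-class concentration, NCMI as its normalization by an inter-class separation term, and training as cross-entropy minimization under an NCMI constraint. The sketch below is a minimal illustration of that idea, assuming a plausible minibatch estimator for CMI, a pairwise cross-entropy separation term, and a fixed penalty weight in place of the constrained optimization; these choices are illustrative assumptions, not the paper's exact definitions.

```python
import torch
import torch.nn.functional as F


def batch_cmi_ncmi(probs: torch.Tensor, labels: torch.Tensor, eps: float = 1e-8):
    """Illustrative minibatch estimators of CMI and NCMI.

    probs:  (N, C) softmax outputs of the classifier for a minibatch.
    labels: (N,)   ground-truth class indices.

    CMI is estimated as the average KL divergence between each sample's
    output distribution and the mean output distribution of its class
    (intra-class concentration). The separation term used here is the
    average pairwise cross-entropy between output distributions of
    samples from different classes; NCMI is the ratio CMI / separation.
    """
    cmi_terms = []
    for c in labels.unique():
        p_c = probs[labels == c]                  # (n_c, C) outputs of class c
        centroid = p_c.mean(dim=0, keepdim=True)  # class-conditional mean distribution
        kl = (p_c * (p_c.add(eps).log() - centroid.add(eps).log())).sum(dim=1)
        cmi_terms.append(kl)
    cmi = torch.cat(cmi_terms).mean()

    # Pairwise cross-entropies H(p_i, p_j) between all output distributions,
    # averaged over pairs whose labels differ (inter-class separation).
    ce_pairs = -probs @ probs.add(eps).log().t()  # (N, N)
    diff_mask = labels.unsqueeze(0) != labels.unsqueeze(1)
    separation = ce_pairs[diff_mask].mean()

    ncmi = cmi / separation.clamp_min(eps)
    return cmi, ncmi


if __name__ == "__main__":
    torch.manual_seed(0)
    logits = torch.randn(32, 10)                  # stand-in for model outputs
    labels = torch.randint(0, 10, (32,))
    probs = logits.softmax(dim=1)
    cmi, ncmi = batch_cmi_ncmi(probs, labels)
    # Penalized surrogate for the NCMI-constrained objective; the weight 0.1
    # is a hypothetical hyper-parameter, not a value from the paper.
    loss = F.cross_entropy(logits, labels) + 0.1 * ncmi
    print(f"CMI={cmi:.4f}  NCMI={ncmi:.4f}  loss={loss:.4f}")
```

In a full training loop, the fixed penalty weight would be tuned or replaced by the constrained formulation described in the abstract, with the NCMI term recomputed from the model's softmax outputs at each step.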