MO4.R2.2

Supervised Contrastive Representation Learning: Landscape Analysis with Unconstrained Features

Tina Behnia, Christos Thrampoulidis, University of British Columbia, Canada

Session:
Topics in Machine Learning 2

Track:
8: Machine Learning

Location:
Ypsilon I-II-III

Presentation Time:
Mon, 8 Jul, 16:45 - 17:05

Session Chair:
Lalitha Sankar, Arizona State University

Abstract
Recent findings reveal that over-parameterized deep neural networks, trained beyond zero training error, exhibit a distinctive structural pattern at the final layer, termed Neural Collapse (NC). These results indicate that the final hidden-layer outputs in such networks display minimal within-class variation over the training set. While existing research extensively investigates this phenomenon under the cross-entropy loss, fewer studies focus on its contrastive counterpart, the supervised contrastive (SC) loss. Through the lens of NC, this paper takes an analytical approach to studying the solutions obtained by optimizing the SC loss. We adopt the unconstrained features model (UFM) as a representative proxy for unveiling NC-related phenomena in sufficiently over-parameterized deep networks. We show that, despite the non-convexity of SC loss minimization, all local minima are global minima. Furthermore, the minimizer is unique (up to rotation). We prove our results by formalizing a tight convex relaxation of the UFM. Finally, through this convex formulation, we further characterize the properties of global solutions under label-imbalanced training data.
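For context, a minimal sketch of the objective under study, assuming the commonly used form of the SC loss and the usual UFM convention of treating the last-layer embeddings as free, norm-constrained optimization variables (the exact normalization and notation in the paper may differ):

$$
\min_{\substack{h_1,\dots,h_n \in \mathbb{R}^d \\ \|h_i\|_2 = 1}} \;\; \sum_{i=1}^{n} \frac{-1}{|\mathcal{P}(i)|} \sum_{p \in \mathcal{P}(i)} \log \frac{\exp\!\left(h_i^\top h_p / \tau\right)}{\sum_{j \neq i} \exp\!\left(h_i^\top h_j / \tau\right)},
$$

where $\mathcal{P}(i)$ is the set of indices of the other training examples sharing the label of example $i$ and $\tau > 0$ is a temperature parameter. Under the UFM, the features $h_i$ stand in for the final hidden-layer outputs of a sufficiently over-parameterized network, so the minimization is carried out over the features themselves rather than the network weights.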