We consider a decentralized learning setting in which data is distributed over the nodes of a graph. The goal is to learn a global model of the distributed data without relying on a central server to perform aggregation. We study a decentralized SGD algorithm in which a random walk on the network carries a global model, which is updated using the local data of the node currently being visited. Our focus is on designing the transition probability matrix of the random walk to speed up convergence. In centralized learning, importance sampling can speed up convergence. It has been shown in the literature that centralized importance sampling can be mimicked in a decentralized setting by using the Metropolis-Hastings (MH) algorithm to design a transition probability matrix that achieves a desired stationary distribution. This paper identifies a drawback of such MH-based algorithms: when the network is poorly connected, such as a ring network, and for certain data configurations, importance sampling via MH can trap the random walk at certain nodes, slowing down convergence. We refer to this phenomenon as the entrapment problem. We propose a new algorithm, Metropolis-Hastings with Lévy Jumps (MHLJ), to overcome the entrapment problem in decentralized importance sampling. MHLJ speeds up convergence by randomly pushing the random walk out of local regions, thereby overcoming entrapment. We prove the convergence rate of MHLJ and characterize the error gap introduced by the Lévy jumps. We also verify our theoretical results with numerical experiments.
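As a rough illustration of the idea (not the paper's exact algorithm), the sketch below runs random-walk SGD with an MH transition matrix targeting an importance-sampling distribution `pi`, and with probability `eps` takes a long-range jump out of the current region. The uniform-jump simplification of the Lévy mechanism, and the names `grad`, `data`, `eps`, and `mhlj_sgd`, are our assumptions for the sake of the example.

```python
import numpy as np

def mh_transition_matrix(adj, pi):
    """Metropolis-Hastings transition matrix whose stationary distribution
    is the target importance-sampling distribution pi.
    Proposal: move to a uniformly random neighbor."""
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    P = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if adj[i, j]:
                # Propose j w.p. 1/deg(i); accept w.p. min(1, pi_j q(j->i) / (pi_i q(i->j))).
                P[i, j] = (1.0 / deg[i]) * min(1.0, (pi[j] * deg[i]) / (pi[i] * deg[j]))
        P[i, i] = 1.0 - P[i].sum()  # rejected proposals keep the walk at node i
    return P

def mhlj_sgd(data, adj, pi, grad, w0, lr=0.01, eps=0.1, T=10000, seed=0):
    """Random-walk SGD sketch: the walk follows the MH chain, but with
    probability eps it takes a long-range (Levy-style) jump, simplified
    here to a jump to a uniformly random node."""
    rng = np.random.default_rng(seed)
    n = adj.shape[0]
    P = mh_transition_matrix(adj, pi)
    w, i = w0.copy(), rng.integers(n)
    for _ in range(T):
        w -= lr * grad(w, data[i])    # SGD step on the current node's local data
        if rng.random() < eps:
            i = rng.integers(n)       # long-range jump: escape the local region
        else:
            i = rng.choice(n, p=P[i]) # ordinary MH step to a neighbor (or stay put)
    return w
```

On a ring graph with `pi` concentrated on a few nodes, the plain MH chain (`eps=0`) tends to linger near those nodes, which is the entrapment behavior described above; setting `eps > 0` lets the walk escape at the cost of perturbing the stationary distribution, consistent with the error gap mentioned in the abstract.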