We consider a decentralized learning setting in which data is distributed over the nodes of a graph. The goal is to learn a global model of the distributed data without relying on a central server to perform aggregation. We study a decentralized SGD algorithm in which a random walk on the network carries a global model, which is updated using the local data of the node currently being visited. Our focus is on designing the transition probability matrix of the random walk to speed up convergence. In centralized learning, importance sampling can speed up convergence. It has been shown in the literature that centralized importance sampling can be mimicked in a decentralized setting by using the Metropolis-Hastings (MH) algorithm to design a transition probability matrix that achieves a desired stationary distribution. This paper identifies a drawback of such MH-based algorithms: when the network is poorly connected, such as a ring network, and for certain data configurations, importance sampling via MH can trap the random walk at certain nodes, slowing down convergence. We refer to this phenomenon as the entrapment problem. We propose a new algorithm, Metropolis-Hastings with Lévy Jumps (MHLJ), to overcome the entrapment problem in decentralized importance sampling. MHLJ speeds up convergence by randomly pushing the random walk out of local regions, thereby overcoming entrapment. We prove the convergence rate of MHLJ and characterize the error gap introduced by the Lévy jumps. We also verify our theoretical results with numerical experiments.
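As a rough illustration of the idea (not the paper's exact algorithm), the sketch below runs random-walk SGD with an MH transition matrix targeting an importance-sampling distribution `pi`, and with probability `eps` takes a long-range jump out of the current region. The uniform-jump simplification of the Lévy mechanism, and the names `grad`, `data`, `eps`, and `mhlj_sgd`, are our assumptions for the sake of the example.

```python
import numpy as np

def mh_transition_matrix(adj, pi):
    """Metropolis-Hastings transition matrix whose stationary distribution
    is the target importance-sampling distribution pi.
    Proposal: move to a uniformly random neighbor."""
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    P = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if adj[i, j]:
                # Propose j w.p. 1/deg(i); accept w.p. min(1, pi_j q(j->i) / (pi_i q(i->j))).
                P[i, j] = (1.0 / deg[i]) * min(1.0, (pi[j] * deg[i]) / (pi[i] * deg[j]))
        P[i, i] = 1.0 - P[i].sum()  # rejected proposals keep the walk at node i
    return P

def mhlj_sgd(data, adj, pi, grad, w0, lr=0.01, eps=0.1, T=10000, seed=0):
    """Random-walk SGD sketch: the walk follows the MH chain, but with
    probability eps it takes a long-range (Levy-style) jump, simplified
    here to a jump to a uniformly random node."""
    rng = np.random.default_rng(seed)
    n = adj.shape[0]
    P = mh_transition_matrix(adj, pi)
    w, i = w0.copy(), rng.integers(n)
    for _ in range(T):
        w -= lr * grad(w, data[i])    # SGD step on the current node's local data
        if rng.random() < eps:
            i = rng.integers(n)       # long-range jump: escape the local region
        else:
            i = rng.choice(n, p=P[i]) # ordinary MH step to a neighbor (or stay put)
    return w
```

On a ring graph with `pi` concentrated on a few nodes, the plain MH chain (`eps=0`) tends to linger near those nodes, which is the entrapment behavior described above; setting `eps > 0` lets the walk escape at the cost of perturbing the stationary distribution, consistent with the error gap mentioned in the abstract.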