Federated Learning (FL) is envisioned as a cornerstone of next-generation mobile systems: by integrating FL into network edge elements (i.e., user terminals and edge/cloud servers), it is expected to unleash the potential of network intelligence by learning from the massive amount of user data while preserving privacy. In this paper, we develop an analytical framework that quantifies how the interplay between user mobility, a fundamental property of mobile networks, and data heterogeneity, a salient feature of FL, affects model training efficiency. Specifically, we derive the convergence rate of a hierarchical FL system operating in a mobile network, showing how user mobility amplifies the model divergence caused by data heterogeneity. The theoretical findings are corroborated by simulation experiments.
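To make the setting concrete, the following is a minimal illustrative sketch (not the paper's algorithm or analysis) of hierarchical FL with mobile users: each client runs local SGD on a scalar quadratic objective whose optimum encodes its data heterogeneity, edge servers aggregate their currently attached clients, the cloud aggregates edge models, and clients migrate between edge servers between rounds. All names and parameter values (e.g., P_MOVE, EDGE_ROUNDS) are assumptions chosen for illustration.

```python
import numpy as np

# Illustrative sketch: hierarchical FedAvg with user mobility.
# Each client i minimizes f_i(w) = 0.5 * (w - mu_i)^2, where mu_i
# models its heterogeneous local data distribution.
rng = np.random.default_rng(0)

NUM_EDGES = 4          # edge servers
CLIENTS_PER_EDGE = 5   # clients initially attached to each edge
LOCAL_STEPS = 5        # local SGD steps between edge aggregations
EDGE_ROUNDS = 3        # edge aggregations between cloud aggregations
CLOUD_ROUNDS = 20      # cloud aggregation rounds
LR = 0.1               # local learning rate
P_MOVE = 0.3           # per-round probability a client changes edge server

num_clients = NUM_EDGES * CLIENTS_PER_EDGE
# Clients attached to the same edge share a bias, so edge-level data
# distributions differ (non-IID across edges).
edge_bias = rng.normal(0.0, 3.0, NUM_EDGES)
mu = np.repeat(edge_bias, CLIENTS_PER_EDGE) + rng.normal(0.0, 0.5, num_clients)
assignment = np.repeat(np.arange(NUM_EDGES), CLIENTS_PER_EDGE)

w_cloud = 0.0
for t in range(CLOUD_ROUNDS):
    edge_models = np.full(NUM_EDGES, w_cloud, dtype=float)
    for _ in range(EDGE_ROUNDS):
        sums = np.zeros(NUM_EDGES)
        counts = np.zeros(NUM_EDGES)
        for i in range(num_clients):
            w = edge_models[assignment[i]]   # pull current edge model
            for _ in range(LOCAL_STEPS):     # local SGD on f_i
                w -= LR * (w - mu[i])
            sums[assignment[i]] += w
            counts[assignment[i]] += 1
        # Edge aggregation (edges with no attached clients keep their model).
        nonempty = counts > 0
        edge_models[nonempty] = sums[nonempty] / counts[nonempty]
        # User mobility: some clients hand over to a random edge server.
        movers = rng.random(num_clients) < P_MOVE
        assignment[movers] = rng.integers(0, NUM_EDGES, movers.sum())
    # Cloud aggregation over edge models.
    w_cloud = edge_models.mean()
    global_loss = 0.5 * np.mean((w_cloud - mu) ** 2)
    print(f"cloud round {t:2d}: w = {w_cloud:+.3f}, loss = {global_loss:.4f}")
```

Varying P_MOVE in this toy setup changes which heterogeneous clients each edge server aggregates over between cloud rounds, which is the mobility/heterogeneity interaction the paper's analysis characterizes formally.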