In DistDGL [1], I am curious whether the target training nodes are only shuffled locally, i.e., each trainer permutes just the train nodes of its own partition every epoch (leaving aside computation load balance).
If that is the case, would training accuracy be affected, given that the training sequence is not fully randomized? A related paper, PaGraph [2], states that local shuffling slows convergence.
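To make the question concrete, here is a minimal NumPy sketch of the two epoch orderings I am asking about. This is not DistDGL's actual sampler code; `partitions`, `local_order`, and `global_order` are hypothetical names for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 1000 train nodes pre-assigned to 4 machines
# by the graph partitioner.
partitions = np.array_split(np.arange(1000), 4)

# Local shuffling (what I believe DistDGL does): each trainer permutes
# only its own partition's train nodes every epoch, so a given node is
# only ever batched together with nodes from the same machine.
local_order = [rng.permutation(part) for part in partitions]

# Fully global shuffling (the usual single-machine baseline): all train
# nodes are permuted together and then sliced per machine, so each
# machine's sequence is an unbiased sample of the whole training set.
global_perm = rng.permutation(np.concatenate(partitions))
global_order = np.array_split(global_perm, 4)
```

Under the local scheme, each machine's mini-batches are never drawn from the full training set, which is the source of my convergence concern.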
[1] DistDGL: Distributed Graph Neural Network Training for Billion-Scale Graphs, Zheng, Da and Ma, Chao and Wang, Minjie and Zhou, Jinjing and Su, Qidong and Song, Xiang and Gan, Quan and Zhang, Zheng and Karypis, George
[2] PaGraph: Scaling GNN Training on Large Graphs via Computation-aware Caching, Lin, Zhiqi and Li, Cheng and Miao, Youshan and Liu, Yunxin and Xu, Yinlong