Hello, I have a question about distributed GNN training with DistDGL, and I hope you can tell me whether my understanding is correct.
- My understanding of distributed GNN training: after partitioning the graph, each worker should hold its own partition together with all of its k-hop neighbors, where k is the number of GNN layers, along with the node/edge embeddings for its partition. During training, workers then use `DistTensor` to request embeddings from other hosts according to their own partitions.
- Given that, I wonder why the partition example does not set `num_hops` according to the GNN's `num_layers`, but instead uses the default value 1, while `num_layers` in the training scripts defaults to 2. Is this correct for training?
- Also, if I change `num_layers` in the training scripts, is the training process still correct in this example?
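To make the concern concrete, here is a toy sketch (plain Python, not DGL code, and the graph/partition are made up for illustration) of why I'd expect the stored halo to grow with the number of layers: a k-layer GNN aggregates messages from nodes up to k hops away, so a partition whose halo only covers 1 hop seems to be missing the nodes a 2-layer model would need.

```python
from collections import deque

def k_hop_halo(adj, core, k):
    """Return the halo: nodes within k hops of the core set, excluding the core.

    adj:  dict mapping node -> list of neighbors (toy undirected graph).
    core: set of nodes owned by this partition.
    k:    number of message-passing hops (i.e. GNN layers).
    """
    seen = set(core)
    frontier = set(core)
    for _ in range(k):
        nxt = set()
        for u in frontier:
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    nxt.add(v)
        frontier = nxt
    return seen - core

# Path graph 0-1-2-3-4; this partition owns nodes {0, 1}.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(k_hop_halo(adj, {0, 1}, 1))  # 1-hop halo: {2}
print(k_hop_halo(adj, {0, 1}, 2))  # 2-hop halo: {2, 3}
```

With `num_hops=1` only node 2 would be local to the partition, yet a 2-layer GNN on node 1 also pulls information from node 3, which is what I assume `DistTensor` would have to fetch remotely.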