Disabling sampling in distributed GraphSAGE


I am currently experimenting with the distributed GraphSAGE code provided by DGL. In order to compare with some other systems that do not do sampling, is it possible to easily change the code, or pass in some argument, so that the neighborhood sampling done in each epoch is disabled?

In other words, I want an epoch in DGL’s GraphSAGE to only do 1 forward/backward step using the entire training set.

Loc Hoang

Sure. You just need to pass the whole graph as the input when computing `logits = model(g, input)`. In the example, `g` is a sampled subgraph; substitute the full graph for `g` there.

Out of curiosity, why do you want to use the distributed version if you can load the whole graph into memory?

You can also refer to our full-graph inference example: https://github.com/dmlc/dgl/blob/master/examples/pytorch/graphsage/experimental/train_dist.py#L84

Thanks for the information!

I’m using distribution to improve compute time rather than to get more memory, since the computation in a GNN is quite expensive.

If I have any more questions I’ll let you know.

  • Loc