Offline Inference Problem on Large Graphs

In Chapter 6.6, it says “The inference algorithm is different from the training algorithm, as the representations of all nodes should be computed layer by layer, starting from the first layer. The consequence is that the inference algorithm will have an outer loop iterating over the layers, and an inner loop iterating over the minibatches of nodes. In contrast, the training algorithm has an outer loop iterating over the minibatches of nodes, and an inner loop iterating over the layers for both neighborhood sampling and message passing.”
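The loop structure the chapter describes can be sketched with a toy dense-adjacency example. Everything here (the graph, the mean-aggregation layer, the sizes) is made up for illustration and is not DGL code; it only shows the "outer loop over layers, inner loop over minibatches" order:

```python
import numpy as np

rng = np.random.default_rng(0)

num_nodes, feat_dim, num_layers, batch_size = 8, 4, 2, 3
# Toy dense adjacency with self-loops so every node has at least one neighbor.
adj = (rng.random((num_nodes, num_nodes)) < 0.5).astype(float)
np.fill_diagonal(adj, 1.0)
weights = [rng.standard_normal((feat_dim, feat_dim)) for _ in range(num_layers)]

h = rng.standard_normal((num_nodes, feat_dim))  # input node features
for W in weights:                       # outer loop: layers
    h_next = np.empty_like(h)
    for start in range(0, num_nodes, batch_size):   # inner loop: minibatches
        batch = np.arange(start, min(start + batch_size, num_nodes))
        # Mean-aggregate over ALL neighbors, reading representations that
        # were fully computed for every node in the previous outer iteration.
        agg = adj[batch] @ h / adj[batch].sum(axis=1, keepdims=True)
        h_next[batch] = np.maximum(agg @ W, 0.0)    # linear layer + ReLU
    h = h_next  # all nodes finish layer l before layer l+1 starts
```

Because the outer loop is over layers, each node's representation at each layer is computed exactly once and then reused by every minibatch of the next layer.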

I was wondering why the inference algorithm needs to be different from the training algorithm. If the inference procedure followed the same steps as training, what would the disadvantages be?

As said in the documentation:

When performing inference it is usually better to truly aggregate over all neighbors instead to get rid of the randomness introduced by sampling. However, full-graph forward propagation is usually infeasible on GPU due to limited memory, and slow on CPU due to slow computation.

Does that address your question?

Thank you for your reply.

If I choose to use MultiLayerFullNeighborSampler instead of MultiLayerNeighborSampler during inference, can I still follow the training procedure (i.e., outer loop for iterating over the minibatches of nodes, and an inner loop for iterating over the layers for both full neighborhood sampling and message passing)?

You could, but that way it will usually cost significantly more (GPU) memory if your graph is large (millions of nodes) and your model is deep. You'll have to use a small batch size in that case.
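The reason the memory cost blows up is neighbor explosion: with full-neighbor sampling, the receptive field of a batch grows by roughly the average degree at every layer. A back-of-the-envelope calculation (the batch size and average degree below are made-up numbers, not from this thread):

```python
# Rough upper bound on input nodes touched per batch when running
# training-style minibatch inference with FULL neighbor sampling:
#   batch_size * avg_degree ** num_layers
batch_size = 1024   # illustrative assumption
avg_degree = 15     # illustrative assumption
for num_layers in (1, 2, 3):
    touched = batch_size * avg_degree ** num_layers
    print(f"{num_layers} layers: up to {touched:,} input nodes per batch")
```

With layer-wise inference, by contrast, each minibatch only ever reads its one-hop neighbors' already-computed representations from the previous layer, so the per-batch cost stays around `batch_size * avg_degree` regardless of depth.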

Thank you. I just want to make sure that, despite the efficiency problem, the logic behind it is sound.
