Neighbourhood Sampling Overhead

Hi,

I am trying neighbourhood sampling with GCN on the OGBL-PPA dataset for a link prediction task. I first implemented a 4-layer full-batch GCN model on an 11 GB GPU, where I was able to fit a model of 50K parameters. That model runs about 42 epochs in 12 hours of training.

On the other hand, when I implemented neighbourhood sampling for the same model, I was able to fit a significantly larger model of 100K parameters, but it only runs about 3 epochs in the same 12 hours of training time. I followed the implementation as described here.

There is a significant increase in the training time for each epoch. Why is that the case? What is causing this overhead?

Neighbor sampling will indeed take longer, especially for models that are light on GPU computation such as GCN, since the sampling itself takes time (often longer than the GCN computation, because it is done on the CPU). The overhead also depends on the batch size, the number of neighbors you sample per layer, whether you sample with replacement, etc. (see the sketch below).
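
For reference, a minimal sketch of where those knobs appear in the DGL dataloading API. The fan-outs, batch size, and worker count here are placeholder values to tune, not recommendations, and `g` / `train_nids` are assumed to be your graph and seed nodes:

```python
import dgl

# Fan-outs control how many neighbors are sampled at each of the 4 layers;
# smaller fan-outs and larger batches reduce the CPU sampling overhead.
sampler = dgl.dataloading.MultiLayerNeighborSampler(
    [15, 10, 10, 5],        # neighbors sampled per layer (placeholder values)
    replace=False)          # sampling without replacement

dataloader = dgl.dataloading.NodeDataLoader(
    g, train_nids, sampler,
    batch_size=1024,        # larger batches amortize per-batch sampling cost
    shuffle=True,
    drop_last=False,
    num_workers=4)          # CPU workers that perform the sampling in parallel

for input_nodes, output_nodes, blocks in dataloader:
    # blocks are the sampled message-flow graphs for one mini-batch
    pass
```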


Another possibility is that copying data from CPU to GPU also takes time. You can potentially save that time by keeping the node features on the GPU. Node classification with GraphSAGE (dgl/train_sampling.py at master · dmlc/dgl · GitHub) and link prediction with GraphSAGE (dgl/train_sampling_unsupervised.py at master · dmlc/dgl · GitHub) are two examples.
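
A minimal sketch of that idea, assuming the full feature matrix fits on the 11 GB GPU (`g` and `dataloader` follow the sampling sketch above):

```python
import torch

device = torch.device('cuda')

# One-time transfer: keep all node features resident on the GPU, so each
# mini-batch only does an on-device gather instead of a host-to-device copy.
feats = g.ndata['feat'].to(device)

for input_nodes, output_nodes, blocks in dataloader:
    blocks = [b.to(device) for b in blocks]       # sampled structure to GPU
    batch_feats = feats[input_nodes.to(device)]   # gather happens on the GPU
    # ... forward/backward pass with batch_feats and blocks ...
```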
