Caching features in GPU memory to accelerate sampling-based GNN training

Hello, I want to cache node features in the available GPU memory before training. For sampling-based training, though, how can I know which nodes are going to be sampled? Is it also possible to get the list of nodes in the sampled subgraph before training? My intention is to reduce data-movement costs.

This is non-trivial. It ultimately depends on the sampling algorithm and the graph itself, and even with both fixed, the access pattern can have poor locality, which limits how much a cache helps. One simple heuristic is to cache the features of high-degree nodes, since they are more likely to appear in sampled subgraphs. There are a couple of papers on this topic; one example is GNNLab.
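To make the degree-based heuristic concrete, here is a minimal sketch of a static cache with a fallback lookup. It uses NumPy to stay self-contained; in a real pipeline the cache array would be a CUDA tensor and the fallback a host-to-device copy. The function names (`build_degree_cache`, `gather_features`) are hypothetical, not part of any library.

```python
import numpy as np

def build_degree_cache(features, degrees, cache_size):
    """Pick the highest-degree nodes and copy their features into a cache.

    In practice cached_feats would live on the GPU; here it is just a
    separate NumPy array standing in for GPU memory.
    """
    cached_ids = np.argsort(degrees)[::-1][:cache_size]
    cached_feats = features[cached_ids].copy()
    # Map node id -> slot in the cache; -1 means "not cached".
    slot_of = np.full(features.shape[0], -1, dtype=np.int64)
    slot_of[cached_ids] = np.arange(cache_size)
    return cached_feats, slot_of

def gather_features(node_ids, features, cached_feats, slot_of):
    """Serve cache hits from the cache, everything else from host memory."""
    out = np.empty((len(node_ids), features.shape[1]), dtype=features.dtype)
    slots = slot_of[node_ids]
    hit = slots >= 0
    out[hit] = cached_feats[slots[hit]]    # would be a GPU-local read
    out[~hit] = features[node_ids[~hit]]   # would be a PCIe transfer
    return out
```

At each training iteration you would call `gather_features` with the node IDs of the freshly sampled subgraph; only the cache misses incur the expensive host-to-device transfer.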