In a stochastic edge prediction training setting, I noticed that a different set of neighbors is sampled for each edge on every run, i.e., the g.sample_neighbors
method returns different neighbors each time it is called. This causes quite a bit of variance in the evaluation results, since some regions of my graph have high connectivity.
- Is there a way to fix the neighbors sampled by the node sampler? I tried setting the random seed, but in vain.
- If fixing the neighbors is impossible, what’s a good way to reduce this variance? I can think of oversampling: since different topologies will be selected for the same edge, the model will be exposed to multiple views of it.
- Are there any papers that deal with this kind of issue?
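
To make the questions above concrete, here is a minimal plain-Python sketch of the behavior I am after. It does not use DGL's API: the adjacency dict `adj`, the helpers `sample_fixed_neighbors` and `averaged_edge_score`, and the toy `score_fn` are all hypothetical names for illustration. The first helper seeds a private RNG per node so the same neighbors come back on every call, and the second averages a score over several such fixed views, which is roughly the oversampling idea.

```python
import random


def sample_fixed_neighbors(adj, node, fanout, base_seed=0):
    """Return the same subset of up to `fanout` neighbors for `node` on every call."""
    neighbors = sorted(adj[node])
    if len(neighbors) <= fanout:
        return neighbors
    # Seed a private RNG from (base_seed, node) so the draw does not depend
    # on global RNG state or on the order in which nodes are visited.
    rng = random.Random(base_seed * 1_000_003 + node)
    return sorted(rng.sample(neighbors, fanout))


def averaged_edge_score(score_fn, adj, u, v, fanout, num_views=5):
    """Average a score for edge (u, v) over several fixed neighborhood views."""
    scores = []
    for view in range(num_views):
        # Each view index acts as a different base seed, so the set of views
        # is itself reproducible across evaluation runs.
        nu = sample_fixed_neighbors(adj, u, fanout, base_seed=view)
        nv = sample_fixed_neighbors(adj, v, fanout, base_seed=view)
        scores.append(score_fn(nu, nv))
    return sum(scores) / len(scores)


# Toy graph as an adjacency dict (assumed integer node ids).
adj = {0: [1, 2, 3, 4, 5], 1: [0, 2], 2: [0, 1, 3]}

first = sample_fixed_neighbors(adj, 0, fanout=2)
second = sample_fixed_neighbors(adj, 0, fanout=2)


# Stand-in for a model: counts the sampled neighbors the two endpoints share.
def score_fn(nu, nv):
    return len(set(nu) & set(nv))


run_a = averaged_edge_score(score_fn, adj, 0, 2, fanout=2)
run_b = averaged_edge_score(score_fn, adj, 0, 2, fanout=2)
```

Here `first == second` and `run_a == run_b` hold by construction, so the evaluation would no longer vary between runs. Whether something equivalent can be done with the built-in sampler is what I am asking.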