Negative sampling in distributed graph?

We are working on a link prediction model. We have read this page (Distributed training - DGL 0.7.0 documentation), but it looks like there is only a node classification example.

What is the best way to do negative sampling in a distributed setting? Do we already have APIs (similar to sample_neighbor) to do that?

In the master branch, you can use EdgeDataLoader to sample positive and negative edges. We'll write a tutorial for distributed link prediction.
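
For reference, a minimal sketch of that pattern on a single-machine DGLGraph. The graph, edge IDs, fan-outs, batch size, and the 5 negatives per positive edge are illustrative choices, not something prescribed in this thread:

```python
import dgl
import torch

# Toy graph standing in for the real training graph.
g = dgl.rand_graph(1000, 5000)
train_eids = torch.arange(g.num_edges())

# Sample a 2-layer neighborhood with fan-outs 10 and 25.
sampler = dgl.dataloading.MultiLayerNeighborSampler([10, 25])

dataloader = dgl.dataloading.EdgeDataLoader(
    g, train_eids, sampler,
    # Draw 5 uniformly random negative edges per positive edge.
    negative_sampler=dgl.dataloading.negative_sampler.Uniform(5),
    batch_size=1024, shuffle=True, drop_last=False)

for input_nodes, pos_graph, neg_graph, blocks in dataloader:
    # pos_graph holds the sampled positive edges,
    # neg_graph the corresponding negative edges,
    # blocks the message-passing subgraphs for the GNN layers.
    pass
```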


Does that mean we have to use EdgeDataLoader directly even if the graph is a DistGraph? How is it different from DistDataLoader? It seems that DistDataLoader can only sample nodes.

Another question: when using EdgeDataLoader with a DistGraph, which sampling method should we use? There are three options: dgl.distributed.graph_services.sample_neighbors(), dgl.dataloading.neighbor.MultiLayerNeighborSampler(), and dgl.sampling.sample_neighbors(). Is it necessary to use a distributed sampling method? (It seems all three methods support distributed sampling, based on their implementations.)

The MultiLayerNeighborSampler dispatches the underlying sampling call to the distributed version based on whether the input graph is a DistGraph. Thus it's compatible with both DGLGraph and DistGraph.
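
A hedged sketch of what that looks like on the distributed side. 'ip_config.txt' and 'my_graph' are placeholders for your own setup, and it assumes the graph was already partitioned and the processes were launched with DGL's distributed launch tooling:

```python
import dgl

# Connect this trainer process to the distributed graph servers.
dgl.distributed.initialize('ip_config.txt')
g = dgl.distributed.DistGraph('my_graph')

# The same sampler object works for both graph types: when g is a
# DistGraph, neighbor sampling is routed to the distributed
# dgl.distributed.sample_neighbors under the hood.
sampler = dgl.dataloading.MultiLayerNeighborSampler([10, 25])
```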

We are fixing the negative sampler for the distributed training scenario.
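
For anyone following along, this is roughly what uniform negative sampling does; a simplified sketch, not DGL's actual implementation. The class name is mine, and it assumes the graph exposes find_edges() and num_nodes(), which is exactly the kind of DistGraph support being fixed:

```python
import torch

class UniformNegative:
    """Sketch of uniform negative sampling: for each positive edge,
    keep its source node and draw k random destination nodes."""
    def __init__(self, k):
        self.k = k

    def __call__(self, g, eids):
        src, _ = g.find_edges(eids)          # endpoints of positive edges
        src = src.repeat_interleave(self.k)  # k negatives per positive
        dst = torch.randint(0, g.num_nodes(), (len(src),))
        return src, dst
```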


Thanks for the update! I was wondering whether dataloading.negative_sampler works in distributed mode; glad to hear that it now works!
