Custom NegativeSampler for Heterogeneous Graphs

I followed the linked documentation to implement a NegativeSampler for heterogeneous graphs. However, I am getting 0 edges in the negative graph.

Also, there is a small mistake in the documentation at that link. It should be

train_eid_dict = {etype: g.edges(etype=etype, form='eid') for etype in g.etypes}

instead of

train_eid_dict = {g.edges(etype=etype, form='eid') for etype in g.etypes}

When I make the above change and pass the dictionary to the edge data loader, it neither throws an error nor produces any negative edges.

I reproduced the problem with the sample code below. Can someone help me find where it is going wrong?

import dgl
import torch

g = dgl.heterograph({
    ('user', 'watched', 'item'): (torch.tensor([0, 1]), torch.tensor([1, 2])),
    ('item', 'watched-by', 'user'): (torch.tensor([0, 1]), torch.tensor([2, 3])),
})

class HeteroNegativeSampler(object):
    def __init__(self, g, k):
        # caches the probability distribution
        self.weights = {
            etype: g.in_degrees(etype=etype).float() ** 0.5
            for _, etype, _ in g.canonical_etypes
        }
        self.k = k

    def __call__(self, g, eids_dict):
        result_dict = {}
        for etype, eids in eids_dict.items():
            # keep the source of each positive edge and repeat it k times
            src, _ = g.find_edges(eids, etype=etype)
            src = src.repeat_interleave(self.k)
            # draw one random destination per repeated source
            dst = self.weights[etype].multinomial(len(src), replacement=True)
            result_dict[etype] = (src, dst)
        return result_dict

train_eid_dict = {etype: g.edges(etype=etype, form='eid') for etype in g.etypes}
sampler = dgl.dataloading.MultiLayerFullNeighborSampler(2)
dataloader = dgl.dataloading.EdgeDataLoader(
    g, train_eid_dict, sampler,
    negative_sampler=HeteroNegativeSampler(g, 5),
    # the remaining keyword arguments (e.g. batch_size) are cut off in the original post
)

input_nodes, pos_graph, neg_graph, bipartites = next(iter(dataloader))

print('Positive graph # nodes:', pos_graph.number_of_nodes(), '# edges:', pos_graph.number_of_edges())
print('Negative graph # nodes:', neg_graph.number_of_nodes(), '# edges:', neg_graph.number_of_edges())

Positive graph # nodes: 7 # edges: 4
Negative graph # nodes: 7 # edges: 0

The documentation is wrong. It is fixed in pull request #2863 of dmlc/dgl, "[Doc]Fix examples on negative graph sampler" by Theheavens.

For negative and positive sampling, I split my heterogeneous graph into two graphs, one for the negative edges and one for the positive edges. How should I change the code?
hetero_pos_net = dgl.heterograph({
hetero_neg_net = dgl.heterograph({


You don’t need to split the graph manually; the data loader takes care of that. If you want to customize the negative sampling technique, you can create a separate class like the one above and pass it to the data loader like this:

negative_sampler=HeteroNegativeSampler(g, 5)

Well, the score column in my dataset determines which samples are negative and which are positive, which is why I thought I should split the graph. If that is not needed, should I create a separate class only for the negative sampler, or for both?

The output is:
Positive graph # nodes: 284 # edges: 1024
Negative graph # nodes: 284 # edges: 0
Why is the number of negative edges zero?

There was a bug in the code. You can modify the code above as described in the PR linked earlier in this thread (Custom NegativeSampler for Heterogeneous Graphs, reply #2 by BarclayII).

Thanks a lot for the help.