Correct way to call forward() on a block for a heterogeneous layer

Hi,

I’m trying to write a custom Conv layer for heterogeneous graphs (multiple node/edge types). This is the relevant code of the forward() function -

    def forward(self, graph, inputs):
        # The input is a dictionary of node features for each type
        funcs = {}
        for srctype, etype, dsttype in graph.canonical_etypes:
            rel_graph = graph[srctype, etype, dsttype]
            if rel_graph.number_of_edges() == 0:
                continue
            h = inputs[srctype]
            graph.nodes[srctype].data['h_%s' % etype] = h
            funcs[etype] = (fn.copy_u('h_%s' % etype, 'm'), fn.mean('m', 'h'))
        graph.multi_update_all(funcs, 'sum')

        return {ntype : graph.nodes[ntype].data['h'] for ntype in graph.ntypes}

When my graph parameter is a Block (produced by EdgeDataLoader()), I get KeyError: 'h' on the last line of forward().
If I pass the whole graph instead of a block, I get the expected output.
How do I fix this? Let me know if I need to add more code for context.

Thanks a lot,
Sachin.

Can you check if the block has edges for all canonical edge types? There’s a chance that the block does not have edges for some canonical edge types and you end up not having h for some node types.
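
A quick way to run that check is sketched below. The helper name empty_relations is made up for illustration, and it relies only on canonical_etypes and num_edges(etype), which DGL graphs and blocks both expose. FakeBlock is a hypothetical stand-in (not a DGL class) so the snippet runs without DGL installed; in practice you would pass the real block.

```python
def empty_relations(graph):
    """Return the canonical edge types of `graph` that have no edges."""
    return [cetype for cetype in graph.canonical_etypes
            if graph.num_edges(cetype) == 0]


class FakeBlock:
    """Hypothetical stand-in exposing only the two members used above."""

    def __init__(self, edge_counts):
        # Maps (srctype, etype, dsttype) -> number of edges.
        self._edge_counts = edge_counts

    @property
    def canonical_etypes(self):
        return list(self._edge_counts)

    def num_edges(self, etype):
        return self._edge_counts[etype]


# Made-up edge counts: one relation is empty in this batch.
block = FakeBlock({
    ('image', 'hasTag', 'tag'): 0,
    ('image', 'similarTo', 'image'): 5,
    ('tag', 'hasImage', 'image'): 4,
})
print(empty_relations(block))
```

Any relation listed here is skipped by the `continue` in forward(), so its destination node type may never receive 'h'.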

I do think that all the edge types have edges, but correct me if I’m wrong here.
The full graph and the block on which it errors are -

Graph(num_nodes={'image': 3, 'tag': 2},
      num_edges={('image', 'hasTag', 'tag'): 4, ('image', 'similarTo', 'image'): 6, ('tag', 'hasImage', 'image'): 4},
      metagraph=[('image', 'tag', 'hasTag'), ('image', 'image', 'similarTo'), ('tag', 'image', 'hasImage')])

Block(num_src_nodes={'image': 3, 'tag': 2},
      num_dst_nodes={'image': 3, 'tag': 2},
      num_edges={('image', 'hasTag', 'tag'): 4, ('image', 'similarTo', 'image'): 5, ('tag', 'hasImage', 'image'): 4},
      metagraph=[('image', 'tag', 'hasTag'), ('image', 'image', 'similarTo'), ('tag', 'image', 'hasImage')])

I have attached the code for context. Let me know if you need more details and I’ll prepare and upload a small example file. I’m not sure how to proceed from this point.

Thanks a lot for helping!

Warm regards,
Sachin.

Code for context

Graph and features -

    graph_data = {
        ('image', 'similarTo', 'image'): (th.tensor([0, 1, 2, 1, 0, 0]), th.tensor([1, 0, 0, 0, 1, 2])),
        ('image', 'hasTag', 'tag'): (th.tensor([0, 1, 2, 2]), th.tensor([0, 1, 0, 1])),
        ('tag', 'hasImage', 'image'): (th.tensor([0, 1, 0, 1]), th.tensor([0, 1, 2, 2]))
    }
    g = dgl.heterograph(graph_data)

    inputs = {}
    inputs['image'] = th.rand(3, 3)
    inputs['tag'] = th.rand(2, 3)

Dataloader and training -

    train_seeds = {
        'similarTo': th.arange(6),
        'hasTag': th.arange(4),
        'hasImage': th.arange(4)
    }
    sampler = dgl.dataloading.MultiLayerNeighborSampler(
        [2, 2])
    train_dataloader = dgl.dataloading.EdgeDataLoader(
        g, train_seeds, sampler,
        negative_sampler=dgl.dataloading.negative_sampler.Uniform(2),
        batch_size=2,
        shuffle=True,
        drop_last=False,
        pin_memory=True,
        num_workers=1)

    for step, (input_nodes, pos_graph, neg_graph, blocks) in enumerate(train_dataloader):
        batch_inputs = {x: inputs[x][input_nodes[x]] for x in input_nodes}
        pos_graph = pos_graph
        neg_graph = neg_graph
        blocks = [block.int() for block in blocks]
        # Compute loss and prediction
        batch_pred = model(blocks, batch_inputs)
        # Loss, backprop, etc.

forward() function for my NN HeteroSAGE (my forward() above was for the layer HeteroSAGEConv) -

    def forward(self, blocks, x):
        h_dict = x
        for l, (layer, block) in enumerate(zip(self.layers, blocks)):
            h_dict = layer(block, h_dict) # Error happens while this is executing
            if l != len(self.layers) - 1:
                h_dict = {k : self.activation(h) for k, h in h_dict.items()}
        return h_dict

Can you provide a ready-to-run script for reproducing the error?

Sure. I attached a small example (not my exact NN, but the smallest code I could write to reproduce the error).

What I also noticed was that if I run test_graph() before running test_blocks(), it succeeds. This might be because test_graph() writes ndata['h'] on the nodes, so there’s no KeyError anymore, but that means test_blocks() isn’t writing to ndata['h'] itself, and the problem still stands.

Let me know if you need anything more from me. I really appreciate your help in solving this!

Thanks,
Sachin.

Code link - https://github.com/s4chin/zyxxy/blob/dgl-forum-example/sample.py

For blocks, you need to replace

    return {ntype : graph.nodes[ntype].data['h'] for ntype in graph.ntypes}

with

    return {ntype : graph.dstnodes[ntype].data['h'] for ntype in graph.ntypes}

as in the doc here.

Ah, I see! Thank you @mufeili!

One last question related to this -
Sometimes the block doesn’t have any dst nodes of a certain ntype because the corresponding edges are missing, which results in the same error when calling return {ntype: graph.dstnodes[ntype].data['h'] for ntype in graph.ntypes}.
Is there a way to force all edge types to be present during batch creation? This won’t be an issue for large graphs, since the probability of it happening will be close to 0, but I was just curious.