Is my understanding of the GCN implementation right?

From https://docs.dgl.ai/tutorials/basics/1_first.html

import torch
import torch.nn as nn

def gcn_message(edges):
    # each edge carries its source node's feature 'h' as the message
    return {'msg' : edges.src['h']}

def gcn_reduce(nodes):
    # each node sums up the messages collected in its mailbox
    return {'h' : torch.sum(nodes.mailbox['msg'], dim=1)}


class GCNLayer(nn.Module):
    def __init__(self, in_feats, out_feats):
        super(GCNLayer, self).__init__()
        self.linear = nn.Linear(in_feats, out_feats)

    def forward(self, g, inputs):
        # set the input node features
        g.ndata['h'] = inputs
        # compute messages on all edges
        g.send(g.edges(), gcn_message)
        # aggregate messages on all nodes
        g.recv(g.nodes(), gcn_reduce)
        # pop the aggregated features and apply the linear transformation
        h = g.ndata.pop('h')
        return self.linear(h)
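
For reference, this is roughly how the layer can be run end to end on a toy graph (a hypothetical sketch, assuming the older DGL release the tutorial targets, where g.send / g.recv are still available):

import torch
import dgl

g = dgl.DGLGraph()
g.add_nodes(4)
g.add_edges([0, 1, 2, 3], [1, 2, 3, 0])   # a 4-node directed cycle

inputs = torch.eye(4)          # one-hot node features, as in the tutorial
layer = GCNLayer(4, 2)
h = layer(g, inputs)
print(h.shape)                 # torch.Size([4, 2])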

My question is about gcn_message and gcn_reduce.

If we pass all the edges into gcn_message, does it mean that each edge will send a message from its source node?

If we pass all the nodes into gcn_reduce, does it mean that each node will receive the messages from all of its incoming edges?

If I edit gcn_reduce to

def gcn_reduce(nodes):
    recv_msg = nodes.mailbox['msg'] # [1,9,34]
    recv_msg2 = torch.sum(recv_msg, dim=1) # [1,34]
    return {'h' : recv_msg2}

I set a breakpoint and stopped at the first call of gcn_reduce. What does the shape [1,9,34] mean in the code above?

Thank you.

DGL does degree bucketing to accelerate the computation: the computation for nodes with the same in-degree is batched together. For example, if both node 1 and node 2 have 3 incoming edges, their incoming messages are batched together, so the shape of nodes.mailbox['msg'] would be [2 (bucket size: two nodes have the same in-degree), 3 (each node's in-degree is 3), feat_size].
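
Here is a minimal sketch that reproduces this on a hypothetical toy graph (assuming the older DGL release the tutorial targets, where g.send / g.recv are available):

import torch
import dgl

def gcn_message(edges):
    return {'msg' : edges.src['h']}

def debug_reduce(nodes):
    # called once per degree bucket; the mailbox shape is
    # [nodes in this bucket, in-degree of this bucket, feat_size]
    print(nodes.mailbox['msg'].shape)
    return {'h' : torch.sum(nodes.mailbox['msg'], dim=1)}

# nodes 3 and 4 each have 3 incoming edges; nodes 0, 1 and 2 each have 1
g = dgl.DGLGraph()
g.add_nodes(5)
g.add_edges([0, 1, 2, 0, 1, 2, 3, 4, 4], [3, 3, 3, 4, 4, 4, 0, 1, 2])
g.ndata['h'] = torch.randn(5, 4)          # feature size 4

g.send(g.edges(), gcn_message)
g.recv(g.nodes(), debug_reduce)
# prints, in some order:
#   torch.Size([3, 1, 4])  -- degree-1 bucket holding nodes 0, 1 and 2
#   torch.Size([2, 3, 4])  -- degree-3 bucket holding nodes 3 and 4

In your case, [1,9,34] would be a bucket holding 1 node whose in-degree is 9, with a feature size of 34.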

Thank you!

If we pass all the edges into gcn_message, does it mean that each edge will send a message from its source node?

If we pass all the nodes into gcn_reduce, does it mean that each node will receive the messages from all of its incoming edges?

Am I right?

I am sorry.

“both node 1 and node 2 have 3 incoming edges”

Could you please upload a figure to illustrate this sentence?

Thank you!

It just means the node has 3 edges pointing to it. Since a DGLGraph is a directed graph, I say 3 in-edges (incoming edges).
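
For example, instead of a figure (a small hypothetical sketch, assuming the older DGL API):

import dgl

# three directed edges, all pointing into node 3
g = dgl.DGLGraph()
g.add_nodes(4)
g.add_edges([0, 1, 2], [3, 3, 3])

print(g.in_degrees())    # tensor([0, 0, 0, 3]) -- node 3 has 3 in-edges
print(g.out_degrees())   # tensor([1, 1, 1, 0])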

Thank you.

I understand.

Each node has 3 incoming edges, and there are 2 such nodes.

I had better improve my English.

Hey all,

I have been following this example as well, and I have a question about it.

labeled_nodes = torch.tensor([0, 33]) # only the instructor and the president nodes are labeled
labels = torch.tensor([0, 1]) # their labels are different

Based on my understanding, those lines assume that only the labels of nodes 0 and 33 are known (I think because this example demonstrates a semi-supervised technique). My question is: what if the labels of all the nodes are known? Can I still modify this code to make it a fully supervised technique?

Thanks in advance.

The GCN example (and the original GCN paper) works under a semi-supervised setting, meaning that you have the entire graph beforehand but labels on only some of the nodes, and you need to predict the labels of the rest of the nodes.
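
Concretely, the training loss in this setting only looks at the labeled nodes (a minimal sketch with a stand-in for the model output):

import torch
import torch.nn.functional as F

labeled_nodes = torch.tensor([0, 33])    # only these two nodes are labeled
labels = torch.tensor([0, 1])

logits = torch.randn(34, 2)              # stand-in for the GCN output on all 34 nodes
logp = F.log_softmax(logits, dim=1)
loss = F.nll_loss(logp[labeled_nodes], labels)   # loss uses only nodes 0 and 33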

Yes, you can use all the labels if you have them. However, you may want to split them into train/validation/test sets to measure generalization.
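
For example (a hypothetical sketch with stand-in labels and a stand-in for the model output):

import torch
import torch.nn.functional as F

num_nodes = 34
all_labels = torch.randint(0, 2, (num_nodes,))   # stand-in for your real labels

perm = torch.randperm(num_nodes)                 # random train/val/test split
train_ids, val_ids, test_ids = perm[:24], perm[24:29], perm[29:]

logits = torch.randn(num_nodes, 2)               # stand-in for the GCN output
logp = F.log_softmax(logits, dim=1)
train_loss = F.nll_loss(logp[train_ids], all_labels[train_ids])
val_acc = (logp[val_ids].argmax(dim=1) == all_labels[val_ids]).float().mean()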

Thanks @VoVAllen,

Let me give it a try.

And to follow up: once I have trained a model, saved it, and loaded it again, how can I run inference on the nodes of a new graph? I tried to find a demo of this, but most of the demos perform train-test-eval in one go.
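
To make the question concrete, here is roughly what I have in mind (a hypothetical sketch reusing GCNLayer from above; the file name and the new graph are made up, and the new graph must use the same input feature size):

import torch
import dgl

trained = GCNLayer(4, 2)                    # stand-in for a model you trained
torch.save(trained.state_dict(), 'gcn_layer.pt')

restored = GCNLayer(4, 2)                   # must match the saved architecture
restored.load_state_dict(torch.load('gcn_layer.pt'))
restored.eval()

# a new graph whose node features also have size 4
new_g = dgl.DGLGraph()
new_g.add_nodes(3)
new_g.add_edges([0, 1, 2], [1, 2, 0])
with torch.no_grad():
    preds = restored(new_g, torch.eye(3, 4)).argmax(dim=1)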

Thanks again