RGCN parameters: in_features, hidden_features, out_features

Hi.

I aim to use an RGCN for link prediction, so I am following: 5.3 Link Prediction — DGL 0.6.1 documentation

When getting to the final step, I don't fully understand the following code:

def compute_loss(pos_score, neg_score):
    # Margin loss: push each positive score above its k negative scores by a margin of 1
    n_edges = pos_score.shape[0]
    return (1 - pos_score.unsqueeze(1) + neg_score.view(n_edges, -1)).clamp(min=0).mean()

k = 5
model = Model(10, 20, 5, hetero_graph.etypes)
user_feats = hetero_graph.nodes['user'].data['feature']
item_feats = hetero_graph.nodes['item'].data['feature']
node_features = {'user': user_feats, 'item': item_feats}
opt = torch.optim.Adam(model.parameters())
for epoch in range(10):
    negative_graph = construct_negative_graph(hetero_graph, k, ('user', 'click', 'item'))
    pos_score, neg_score = model(hetero_graph, negative_graph, node_features, ('user', 'click', 'item'))
    loss = compute_loss(pos_score, neg_score)
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(loss.item())

Those 10, 20, 5 in the Model arguments… where do they come from? Checking the Model function parameters in the model definition, I see they are in_features, hidden_features, out_features.

My question is: how can I choose those numbers for my graph?

As I said, my model has the same parameters as the tutorial (with a different graph, obviously).

Thanks

in_features corresponds to the size of your input features.
out_features corresponds to the size of your output, usually the number of classes for classification or 1 for regression.
hidden_features corresponds to the size of your hidden state, which you set as a hyperparameter.
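For example, a minimal sketch of how you would typically set them (reusing the tutorial's hetero_graph; num_classes is a hypothetical placeholder):

# in_features comes from the data: the width of your input node feature tensors
in_features = hetero_graph.nodes['user'].data['feature'].shape[1]

# hidden_features is a hyperparameter you are free to pick (e.g. 16, 64, 256)
hidden_features = 64

# out_features depends on the task, e.g. the number of classes for node classification
out_features = num_classes  # hypothetical placeholder

model = Model(in_features, hidden_features, out_features, hetero_graph.etypes)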


Thank you so much. One last thing: is the number of classes, in this case, the number of relation types in the graph?

No, it is the number of output node classes for node classification. That is different from the number of relations, which tells you how many edge types you have.
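For instance, you can read both quantities off your data (a sketch; the labels tensor here is a hypothetical example):

print(hetero_graph.etypes)             # relation/edge types, e.g. ['click', 'clicked-by', ...]
num_classes = int(labels.max()) + 1    # node classes come from your labels, not from the graph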


But should I still use the number of node classes if this is a link prediction task?

Oh I didn’t notice. In this case you don’t need the number of classes. You can set out_features to be the same as hidden_features so that the representations can be fed into the score predictor to produce the final positive/negative scores.
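For example (a sketch; the sizes are arbitrary placeholders):

in_features = 10       # width of the input node features, as in the tutorial
hidden_features = 20
model = Model(in_features, hidden_features, hidden_features, hetero_graph.etypes)
# out_features == hidden_features: the score predictor just takes dot products of the
# final node representations, so no class count is involved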


Thanks again. When trying my model, I came to the conclusion that hidden_features should be the number of nodes in my graph, and in_features the node embedding dimension. Otherwise, the code produces errors. Is this right?

I think batching could be an alternative to this, but I don't know how to use it in this RGCN code.

No. hidden_features should be independent of the number of nodes in the graph. It is just the number of dimensions of the features in the hidden layers.
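For intuition, a GraphConv layer's weight matrix depends only on the feature dimensions, never on the graph size (a quick check with arbitrary sizes):

import dgl.nn as dglnn

conv = dglnn.GraphConv(700, 256)   # in_feats=700, out_feats=256
print(conv.weight.shape)           # torch.Size([700, 256]): no num_nodes anywhere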


Thanks again.

I was using the node embedding dimension (700) as in_dim and the num_nodes (2100) as the hidden one. If I tried other values, I got this type of error:

RuntimeError: The size of tensor a (2100) must match the size of tensor b (500) at non-singleton dimension 0

Now I see that I can use the embedding dimension both as the input and the hidden dim. That would be correct, right? No strange errors appear doing it this way. Could my final model be the following, using the node embedding dimension as in, hidden and out?

node_embedding_dim = 700

model(node_embedding_dim, node_embedding_dim, node_embedding_dim, g.etypes)

Everything appears to work fine. Should I use a different output_dim?

Thank you so much.

What was the stack trace? Setting the number of dimensions the same as the number of nodes sounds extremely unnatural because it essentially says that the number of parameters of the weight matrices would scale with the size of the graph. It’s likely that your model implementation is not right.


Yes, you're right, the number of nodes was a mistake. Let's forget about it. Thanks again, really helpful.

Now, could I use the node embedding dimension as all three parameters, as I mentioned before?

Now my model implementation looks like this, basically the same as in the tutorial:

import dgl
import dgl.function as fn
import dgl.nn as dglnn
import torch
import torch.nn as nn
import torch.nn.functional as F

def construct_negative_graph(graph, k, etype):
    # For each positive edge, sample k negative edges by corrupting the destination node
    utype, _, vtype = etype
    src, dst = graph.edges(etype=etype)
    neg_src = src.repeat_interleave(k)
    neg_dst = torch.randint(0, graph.num_nodes(vtype), (len(src) * k,))
    return dgl.heterograph(
        {etype: (neg_src, neg_dst)},
        num_nodes_dict={ntype: graph.num_nodes(ntype) for ntype in graph.ntypes})

class HeteroDotProductPredictor(nn.Module):
    # Score predictor from the tutorial: dot product of the two endpoint representations
    def forward(self, graph, h, etype):
        with graph.local_scope():
            graph.ndata['h'] = h
            graph.apply_edges(fn.u_dot_v('h', 'h', 'score'), etype=etype)
            return graph.edges[etype].data['score']

class RGCN(nn.Module):
    def __init__(self, in_feats, hid_feats, out_feats, rel_names):
        super().__init__()

        self.conv1 = dglnn.HeteroGraphConv({
            rel: dglnn.GraphConv(in_feats, hid_feats)
            for rel in rel_names}, aggregate='sum')
        self.conv2 = dglnn.HeteroGraphConv({
            rel: dglnn.GraphConv(hid_feats, out_feats)
            for rel in rel_names}, aggregate='sum')

    def forward(self, graph, inputs):
        # inputs are features of nodes
        h = self.conv1(graph, inputs)
        h = {k: F.relu(v) for k, v in h.items()}
        h = self.conv2(graph, h)
        return h

class Model(nn.Module):
    def __init__(self, in_features, hidden_features, out_features, rel_names):
        super().__init__()
        self.sage = RGCN(in_features, hidden_features, out_features, rel_names)
        self.pred = HeteroDotProductPredictor()

    def forward(self, g, neg_g, x, etype):
        h = self.sage(g, x)
        return self.pred(g, h, etype), self.pred(neg_g, h, etype)

def compute_loss(pos_score, neg_score):
    # Margin loss
    n_edges = pos_score.shape[0]
    return (1 - pos_score.unsqueeze(1) + neg_score.view(n_edges, -1)).clamp(min=0).mean()

embeddings_dimensions = len(g.ndata['Feats'][1])  # = 768, the width of each node feature vector

model = Model(embeddings_dimensions, embeddings_dimensions, embeddings_dimensions, g.etypes)

opt = torch.optim.Adam(model.parameters())  # Adam optimizer
node_feats = g.ndata['Feats']

The training loop is the same as in the tutorial I mentioned before.

Should I change the value of any of those parameters?

Thank you so much.

It seems that the first layer of your RGCN directly takes the node features as input. In this case, you need to make sure that the RGCN's in_features matches the number of dimensions of your input node features. Otherwise it looks fine to me.
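A quick sanity check along those lines (a sketch using the names from your snippet):

# in_features must equal the width of the raw node features fed to the first layer
assert embeddings_dimensions == g.ndata['Feats'].shape[1]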


I make sure of it with this line, right?

Right. I missed it.



No problem. One last thing. Could it help if, instead of the aforementioned:

model = Model(embeddings_dimensions, embeddings_dimensions, embeddings_dimensions, g.etypes)

I do something like:

model = Model(embeddings_dimensions, embeddings_dimensions * 2, embeddings_dimensions * 2, g.etypes)

I see that in the tutorial the hidden layer is twice the size of the input layer (or at least bigger). I understand it isn't necessary, but I don't know whether it would be advisable?

Thanks for everything.

That is usually a fixed number or a hyperparameter you tune a little (with grid search or random search for instance), not necessarily twice the size of the input layer.
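For example, a minimal grid search sketch (train_and_evaluate is a hypothetical helper that trains the model and returns a validation metric):

best_hidden, best_score = None, float('-inf')
for hidden in [128, 256, 512, 1024]:       # candidate hidden_features values
    model = Model(embeddings_dimensions, hidden, hidden, g.etypes)
    score = train_and_evaluate(model, g)   # hypothetical helper
    if score > best_score:
        best_hidden, best_score = hidden, score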
