How to handle different dimensions of input node features

yolo · August 22, 2021, 4:16pm

Hello
I found a similar question, but it didn’t answer my question.

I am following the official DGL link prediction tutorial.

My graph has two node types (user, item), and each type has a different feature dimension. I get the following error:

DGLError: Dot operator is only available for arrays with the same size on last dimension, but got torch.Size([5]) and torch.Size([3]).

Following is the code I am using:

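import numpy as np
import torch
import dgl

# Model, construct_negative_graph and compute_loss are the ones defined in the tutorial.

n_users = 1000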
n_items = 500
n_clicks = 5000
n_hetero_features = 10

click_src = np.random.randint(0, n_users, n_clicks)
click_dst = np.random.randint(0, n_items, n_clicks)

hetero_graph = dgl.heterograph({
    ('user', 'click', 'item'): (click_src, click_dst),
})

hetero_graph.nodes['user'].data['h'] = torch.randn(n_users, 5)
hetero_graph.nodes['item'].data['h'] = torch.randn(n_items, 3)
# Note: torch.randint(1, 2, ...) always returns 1, so every edge feature here is 1.
hetero_graph.edges['click'].data['h'] = torch.randint(1, 2, (hetero_graph.number_of_edges(),))

# k negative samples
k = 5

in_features = hetero_graph.ndata['h']['user'].shape[1]
hidden_features = 20
out_features = 5

model = Model(in_features, hidden_features, out_features, hetero_graph.etypes)

user_feats = hetero_graph.nodes['user'].data['h']
item_feats = hetero_graph.nodes['item'].data['h']
node_features = {'user': user_feats, 'item': item_feats}
opt = torch.optim.Adam(model.parameters())

for epoch in range(10):
    negative_graph = construct_negative_graph(hetero_graph, k, ('user', 'click', 'item'))
    pos_score, neg_score = model(hetero_graph, negative_graph, node_features, ('user', 'click', 'item'))
    loss = compute_loss(pos_score, neg_score)
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(loss.item())

Could someone please help me with the following questions?

1. Why is this error happening, and how do I fix it?
2. If my node types have different feature dimensions (in this case 5 and 3), what should I set in_features to?
3. What should out_features be? In my example, is it the number of unique items or unique users?
4. How do I tell the model to use the edge features as well?

I have looked through the GitHub issues and the answers on this forum, but I’m still quite confused about how to move forward.

Regards
yolo

BarclayII · August 23, 2021, 6:36am

It seems that you are using u_dot_v to predict scores. A dot product can only operate on vectors with the same dimensionality, so a 5-dim vector and a 3-dim vector will not work. You will need to either replace the u_dot_v call in the score predictor with your own function (e.g. MLPPredictor in "5.2 Edge Classification/Regression" of the DGL 0.7.0 documentation), or make the dimensionalities the same by projecting the features with an MLP.
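
As a rough illustration of the first option, here is a minimal sketch in plain PyTorch (not DGL, so it runs on its own). It scores each edge by concatenating the source and destination features and passing them through a small MLP, so the two sides do not need the same dimensionality. The class name ConcatEdgeScorer, the hidden size and the random edges are assumptions made for this example; they are not taken from the tutorial.

```python
import torch
import torch.nn as nn


class ConcatEdgeScorer(nn.Module):
    """Hypothetical scorer: an MLP over [src_feat || dst_feat] for each edge."""

    def __init__(self, src_dim, dst_dim, hidden_dim=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(src_dim + dst_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, src_feats, dst_feats, src_idx, dst_idx):
        # Gather the endpoint features of every edge, then concatenate them.
        pair = torch.cat([src_feats[src_idx], dst_feats[dst_idx]], dim=1)
        return self.mlp(pair).squeeze(1)  # one score per edge


torch.manual_seed(0)
user_feats = torch.randn(1000, 5)  # 5-dim user features, as in the question
item_feats = torch.randn(500, 3)   # 3-dim item features, as in the question
click_src = torch.randint(0, 1000, (5000,))
click_dst = torch.randint(0, 500, (5000,))

scorer = ConcatEdgeScorer(src_dim=5, dst_dim=3)
edge_scores = scorer(user_feats, item_feats, click_src, click_dst)
print(edge_scores.shape)
```

In a DGL model the same idea would live inside the score predictor, applied per edge instead of u_dot_v.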

in_features should be the same as the dimensionality of the input node features to your GNN model. It can either be your initial node feature size or the output of an initial MLP that projects your initial node features.
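
A minimal sketch of the projection approach, assuming one linear layer per node type that maps the raw features to a shared size; in_features of the GNN is then that shared size. The helper name TypeProjector and the value 8 are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn


class TypeProjector(nn.Module):
    """Hypothetical helper: one linear projection per node type."""

    def __init__(self, in_dims, common_dim):
        super().__init__()
        self.proj = nn.ModuleDict(
            {ntype: nn.Linear(dim, common_dim) for ntype, dim in in_dims.items()}
        )

    def forward(self, feats):
        # Map each node type's features to the shared dimensionality.
        return {ntype: self.proj[ntype](x) for ntype, x in feats.items()}


node_features = {"user": torch.randn(1000, 5), "item": torch.randn(500, 3)}

in_features = 8  # assumed common size; this is what the GNN would receive
projector = TypeProjector({"user": 5, "item": 3}, in_features)
projected = projector(node_features)

for ntype, x in projected.items():
    print(ntype, tuple(x.shape))
```

The projection layers are trainable, so their parameters need to reach the optimizer, for example by making the projector a submodule of the model.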

out_features should be the size of whatever feature you feed into the score predictor. For link prediction it is a hyperparameter that can be any number, and it is usually unrelated to the number of items or users.
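
To make this concrete, here is a small sketch in plain PyTorch in which random tensors stand in for the GNN outputs. Whatever value out_features takes, a dot-product predictor returns one score per edge, so the value does not depend on the number of users or items. The function name dot_scores and the sizes tried are assumptions for the example.

```python
import torch


def dot_scores(user_emb, item_emb, src_idx, dst_idx):
    # Dot product between the two endpoint embeddings of every edge.
    return (user_emb[src_idx] * item_emb[dst_idx]).sum(dim=1)


torch.manual_seed(0)
src_idx = torch.randint(0, 1000, (5000,))
dst_idx = torch.randint(0, 500, (5000,))

score_shapes = {}
for out_features in (5, 64):
    # Random stand-ins for the embeddings a GNN would produce.
    user_emb = torch.randn(1000, out_features)
    item_emb = torch.randn(500, out_features)
    scores = dot_scores(user_emb, item_emb, src_idx, dst_idx)
    score_shapes[out_features] = tuple(scores.shape)

print(score_shapes)
```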

To use edge features, you will need to write your own message passing functions in your model. See for instance item 13 in Frequently Asked Questions (FAQ).
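
As a sketch of what such a message passing step computes, the plain PyTorch snippet below multiplies each source feature by a scalar edge feature and sums the results at the destination node. In DGL, the built-in message and reduce functions fn.u_mul_e and fn.sum play this role inside update_all; the function name weighted_sum_aggregate and the toy graph are assumptions for this example.

```python
import torch


def weighted_sum_aggregate(src_feats, edge_weight, src_idx, dst_idx, num_dst):
    # Message on each edge: source feature scaled by the edge feature.
    messages = src_feats[src_idx] * edge_weight.unsqueeze(1)
    # Reduce: sum all incoming messages at each destination node.
    out = torch.zeros(num_dst, src_feats.shape[1], dtype=src_feats.dtype)
    out.index_add_(0, dst_idx, messages)
    return out


# Toy graph: 2 users, 2 items, 3 click edges (user -> item).
user_feats = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
src_idx = torch.tensor([0, 1, 1])
dst_idx = torch.tensor([0, 0, 1])
edge_weight = torch.tensor([0.5, 2.0, 1.0])

item_agg = weighted_sum_aggregate(user_feats, edge_weight, src_idx, dst_idx, num_dst=2)
print(item_agg)
```

Item 0 receives 0.5 * [1, 2] + 2.0 * [3, 4], and item 1 receives 1.0 * [3, 4].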
