Node Regression Example

I am new to graph neural networks. I am trying to run a node regression, but most of the available examples are for classification. I understand that I can adapt the classification pipeline to regression, for example by using an output feature of length 1 and the MSE loss function, but I am not sure how to calculate the loss during training while iterating through epochs. Here is some sample classification code I got from DGL. Can someone point out the changes needed for regression?

import torch
import torch.nn.functional as F

def train(g, model):
    optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
    best_val_acc = 0
    best_test_acc = 0

    features = g.ndata['feat']
    labels = g.ndata['label']
    train_mask = g.ndata['train_mask']
    val_mask = g.ndata['val_mask']
    test_mask = g.ndata['test_mask']
    for e in range(100):
        # Forward
        logits = model(g, features)

        # Compute prediction
        pred = logits.argmax(1)

        # Compute loss
        # Note that you should only compute the losses of the nodes in the training set.
        loss = F.cross_entropy(logits[train_mask], labels[train_mask])

        # Compute accuracy on training/validation/test
        train_acc = (pred[train_mask] == labels[train_mask]).float().mean()
        val_acc = (pred[val_mask] == labels[val_mask]).float().mean()
        test_acc = (pred[test_mask] == labels[test_mask]).float().mean()

        # Save the best validation accuracy and the corresponding test accuracy.
        if best_val_acc < val_acc:
            best_val_acc = val_acc
            best_test_acc = test_acc

        # Backward: clear old gradients, backpropagate, update the weights
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if e % 5 == 0:
            print('In epoch {}, loss: {:.3f}, val acc: {:.3f} (best {:.3f}), test acc: {:.3f} (best {:.3f})'.format(
                e, loss, val_acc, best_val_acc, test_acc, best_test_acc))

model = GCN(g.ndata['feat'].shape[1], 16, dataset.num_classes)
train(g, model)

Or, if someone has an example of node regression they can point me towards, that would be very helpful.


Since you have already changed the output dimension, what remains to change is the loss computation and the evaluation metric: in particular, how loss, train_acc, val_acc and test_acc are computed.
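As a rough sketch of those changes, the loop might look like the following. It assumes the model's output has shape (num_nodes, 1), that g.ndata['label'] holds one continuous target per node with shape (num_nodes,), and that the masks are boolean tensors. The name train_regression and the choice of MSE as the tracked metric are illustrative assumptions, not something taken from the DGL tutorial.

```python
import torch
import torch.nn.functional as F


def train_regression(g, model, epochs=100, lr=0.01):
    """Hypothetical regression variant of the classification loop.

    Assumes model(g, features) returns shape (num_nodes, 1) and that
    g.ndata['label'] has shape (num_nodes,).
    """
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    best_val_mse = float('inf')
    best_test_mse = float('inf')

    features = g.ndata['feat']
    labels = g.ndata['label'].float()
    train_mask = g.ndata['train_mask']
    val_mask = g.ndata['val_mask']
    test_mask = g.ndata['test_mask']

    for e in range(epochs):
        # Forward: the output is the prediction itself, so there is no argmax.
        # Drop the trailing dimension so pred matches labels: (num_nodes,)
        pred = model(g, features).squeeze(-1)

        # Compute the loss on the training nodes only
        loss = F.mse_loss(pred[train_mask], labels[train_mask])

        # Evaluate on validation/test nodes; lower MSE is better
        with torch.no_grad():
            val_mse = F.mse_loss(pred[val_mask], labels[val_mask]).item()
            test_mse = F.mse_loss(pred[test_mask], labels[test_mask]).item()

        # Keep the best (lowest) validation error and the matching test error
        if val_mse < best_val_mse:
            best_val_mse = val_mse
            best_test_mse = test_mse

        # Backward
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if e % 5 == 0:
            print('In epoch {}, loss: {:.3f}, val mse: {:.3f} (best {:.3f}), test mse: {:.3f} (best {:.3f})'.format(
                e, loss.item(), val_mse, best_val_mse, test_mse, best_test_mse))

    return best_val_mse, best_test_mse
```

Note that the comparison flips relative to accuracy: with an error metric you keep the smallest validation value, so the "best" trackers start at infinity rather than 0.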

The evaluation metric is what I am struggling with. I believe I need to change it to R-squared, but I can’t quite get the right answer.

Say your ground truth values are labels and the predictions are pred. You can compute R-squared via the following:

tot = ((labels - labels.mean()) ** 2).sum()
res = ((labels - pred) ** 2).sum()
r2 = 1 - res / tot

You can either compute this score once per minibatch or compute a global one by comparing all the predictions and their corresponding ground truths.
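To make the two options concrete, here is a small sketch that wraps the formula above in a helper and adds a global variant. The function names r2_score and global_r2 are hypothetical, and the code assumes pred and labels are tensors of the same shape.

```python
import torch


def r2_score(pred, labels):
    """R-squared: 1 - SS_res / SS_tot.

    pred and labels must have the same shape. A (N, 1) prediction against
    (N,) labels would broadcast to (N, N) and silently give a wrong score.
    The result is undefined when all labels are equal (SS_tot is zero).
    """
    tot = ((labels - labels.mean()) ** 2).sum()
    res = ((labels - pred) ** 2).sum()
    return (1 - res / tot).item()


def global_r2(batches):
    """Global R-squared over an iterable of (pred, labels) minibatch pairs."""
    preds, targets = zip(*batches)
    return r2_score(torch.cat(preds), torch.cat(targets))
```

In a full-graph loop with masks, the per-split score would be something like r2_score(pred[val_mask], labels[val_mask]). The per-minibatch and global scores generally differ, because each minibatch uses its own label mean as the baseline.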