Node Regression Example

pmish · July 2, 2021, 9:42pm

I am new to graph neural networks. I am trying to run node regression, but most available examples are for classification. I understand that I can adapt the classification pipeline to regression, for example by using an output feature of length 1 and the MSE loss function. However, I am not sure how to calculate the loss during training while iterating through epochs. Here is sample classification code I got from DGL; can someone point out the changes needed for regression?

import torch
import torch.nn.functional as F

def train(g, model):
    optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
    best_val_acc = 0
    best_test_acc = 0

    features = g.ndata['feat']
    labels = g.ndata['label']
    train_mask = g.ndata['train_mask']
    val_mask = g.ndata['val_mask']
    test_mask = g.ndata['test_mask']
    for e in range(100):
        # Forward
        logits = model(g, features)

        # Compute prediction
        pred = logits.argmax(1)

        # Compute loss
        # Note that you should only compute the losses of the nodes in the training set.
        loss = F.cross_entropy(logits[train_mask], labels[train_mask])

        # Compute accuracy on training/validation/test
        train_acc = (pred[train_mask] == labels[train_mask]).float().mean()
        val_acc = (pred[val_mask] == labels[val_mask]).float().mean()
        test_acc = (pred[test_mask] == labels[test_mask]).float().mean()

        # Save the best validation accuracy and the corresponding test accuracy.
        if best_val_acc < val_acc:
            best_val_acc = val_acc
            best_test_acc = test_acc

        # Backward
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if e % 5 == 0:
            print('In epoch {}, loss: {:.3f}, val acc: {:.3f} (best {:.3f}), test acc: {:.3f} (best {:.3f})'.format(
                e, loss, val_acc, best_val_acc, test_acc, best_test_acc))

model = GCN(g.ndata['feat'].shape[1], 16, dataset.num_classes)
train(g, model)

Or, if someone has an example of node regression they can point me towards, that would be very helpful.

Thanks!

BarclayII · July 5, 2021, 6:45am

Since you have already changed the output dimension, the remaining changes are the loss computation and the evaluation metric, that is, the lines that compute loss, train_acc, val_acc and test_acc.
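A minimal sketch of those two changes, under some assumptions: the loop below swaps F.cross_entropy for F.mse_loss and tracks validation MSE instead of accuracy. TinyRegressor, train_regression and the random tensors are hypothetical placeholders so the snippet runs without a graph; in practice you would pass your DGL graph and a GCN whose last layer has output size 1.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyRegressor(nn.Module):
    """Hypothetical stand-in for a GCN; it ignores the graph argument."""

    def __init__(self, in_feats):
        super().__init__()
        self.linear = nn.Linear(in_feats, 1)

    def forward(self, g, features):
        return self.linear(features)


def train_regression(g, model, features, labels, train_mask, val_mask, epochs=100):
    optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
    best_val_mse = float("inf")
    for e in range(epochs):
        # Forward: squeeze (N, 1) to (N,) so predictions match the labels' shape.
        pred = model(g, features).squeeze(-1)

        # Compute the loss on the training nodes only.
        loss = F.mse_loss(pred[train_mask], labels[train_mask])

        # Backward
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Evaluation: lower MSE is better, so keep the minimum seen so far.
        with torch.no_grad():
            val_mse = F.mse_loss(pred[val_mask], labels[val_mask]).item()
        best_val_mse = min(best_val_mse, val_mse)
    return best_val_mse


# Synthetic data, purely for illustration (assumption: float targets, boolean masks).
torch.manual_seed(0)
features = torch.randn(60, 4)
labels = features @ torch.tensor([1.0, -2.0, 0.5, 3.0])
train_mask = torch.zeros(60, dtype=torch.bool)
train_mask[:40] = True
val_mask = ~train_mask

model = TinyRegressor(4)
best_val_mse = train_regression(None, model, features, labels, train_mask, val_mask)
print(f"best val MSE: {best_val_mse:.3f}")
```

Note that the labels must be a float tensor for F.mse_loss, whereas the classification version uses integer class indices.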

pmish · July 8, 2021, 2:42am

The evaluation metric is what I am struggling with. I believe I need to change it to R_squared, but can’t exactly get the right answer.

BarclayII · July 12, 2021, 1:57am

Say your ground truth values are labels and the predictions are pred. You can compute R-squared via the following:

tot = ((labels - labels.mean()) ** 2).sum()
res = ((labels - pred) ** 2).sum()
r2 = 1 - res / tot

You can either compute this score once per minibatch or compute a global one by comparing all the predictions and their corresponding ground truths.
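As a quick sanity check, the same formula can be wrapped in a small helper and tried on two cases whose answer is known. This is only a sketch: r2_score is a hypothetical helper name and the tensors are toy values.

```python
import torch


def r2_score(pred, labels):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    tot = ((labels - labels.mean()) ** 2).sum()
    res = ((labels - pred) ** 2).sum()
    return 1 - res / tot


labels = torch.tensor([1.0, 2.0, 3.0, 4.0])

# Exact predictions give a score of 1.
perfect = r2_score(labels.clone(), labels)

# Always predicting the mean of the labels (2.5 here) gives a score of 0.
baseline = r2_score(torch.full_like(labels, 2.5), labels)

print(perfect.item(), baseline.item())
```

Inside the training loop you would call it on the masked tensors, for example r2_score(pred[val_mask], labels[val_mask]), making sure pred and labels have the same shape. Because each minibatch uses its own label mean, averaging per-batch scores generally does not equal the global score.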

Francheesecake · December 13, 2021, 4:41pm

Hi,

I’m also working on node regression with MSE loss. I’m using the PyTorch MSE loss function:

import torch.nn.functional as F

for epoch in range(epochs):
    outputs = net(g, features)[train_mask]  # Keep only the training nodes
    # Calculate MSE loss
    loss = F.mse_loss(outputs, scores)

Here, scores are the ground-truth target values.
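One thing to watch for with this pattern, shown as a sketch on toy tensors rather than the real model: if the network's output has shape (N, 1) and scores has shape (N,), F.mse_loss broadcasts the two to (N, N) and returns a misleading value. The tensor values below are made up for illustration.

```python
import warnings

import torch
import torch.nn.functional as F

outputs = torch.tensor([[1.0], [2.0], [3.0]])  # model output, shape (3, 1)
scores = torch.tensor([1.0, 2.0, 3.0])         # targets, shape (3,)

# Mismatched shapes broadcast to (3, 3), so the loss is not zero even though
# every prediction equals its target. PyTorch emits a UserWarning here,
# which is silenced only to keep the demo output clean.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    broadcast_loss = F.mse_loss(outputs, scores)

# Dropping the trailing dimension compares predictions and targets one to one.
matched_loss = F.mse_loss(outputs.squeeze(-1), scores)

print(broadcast_loss.item(), matched_loss.item())
```

Squeezing the output (or reshaping scores to (N, 1)) before computing the loss avoids the problem.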

gudeh · February 10, 2023, 3:42pm

May I ask what the reason is for choosing MSE over the other available options? This depends on each implementation, correct?