Accuracy in multiclass edge classification

grendelaglaeca · April 15, 2022, 11:11am

I’m looking for a way to measure the accuracy of an edge classification model (following this tutorial). How could I modify this snippet from a node classification model to work with the edge classifier? Right now it shows 0.0 accuracy in every epoch.

train_acc = (pred[train_mask] == labels[train_mask]).float().mean()
val_acc = (pred[val_mask] == labels[val_mask]).float().mean()
test_acc = (pred[test_mask] == labels[test_mask]).float().mean()

Additionally, what would be the best way of splitting graph edges into test/train/val?

minjie · April 17, 2022, 4:27am

The link you showed is indeed an edge classification model. I don’t understand what you are trying to modify.

Additionally, what would be the best way of splitting graph edges into test/train/val?

For node classification and link prediction, we added two new APIs AsNodePredDataset and AsLinkPredDataset (link). For edge classification, you could try to mimic how the APIs are implemented. Generally, there are two steps:

Decide on the dataset split approach. A common way is to randomly assign edges to the train/val/test sets.
Generate mask tensors and save them to the edata of your graph.
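
The two steps above could be sketched roughly as follows. This is only an illustration in plain PyTorch: the helper name `random_edge_masks`, the 80/10/10 split, and the mask key names are assumptions, not part of the DGL API.

```python
import torch

def random_edge_masks(num_edges, train_frac=0.8, val_frac=0.1, seed=0):
    """Hypothetical helper: boolean train/val/test masks over edge IDs."""
    gen = torch.Generator().manual_seed(seed)
    perm = torch.randperm(num_edges, generator=gen)  # random order of edge IDs
    n_train = int(train_frac * num_edges)
    n_val = int(val_frac * num_edges)
    bounds = {
        "train_mask": (0, n_train),
        "val_mask": (n_train, n_train + n_val),
        "test_mask": (n_train + n_val, num_edges),  # whatever is left over
    }
    masks = {}
    for name, (lo, hi) in bounds.items():
        mask = torch.zeros(num_edges, dtype=torch.bool)
        mask[perm[lo:hi]] = True
        masks[name] = mask
    return masks

masks = random_edge_masks(num_edges=1000)

# With a DGL graph `g`, the masks would then be stored as edge data, e.g.:
# for name, mask in masks.items():
#     g.edata[name] = mask
```

Every edge ends up in exactly one of the three masks, so the same indexing pattern as in the node classification snippet can be reused on edges.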

grendelaglaeca · April 17, 2022, 11:33am

Thank you for your reply!

Sorry for the confusion. What I mean is that the code snippet I shared comes from the node classification task, while I want to write a similar function to measure edge classification accuracy. Specifically, I would like to compute the percentage of correctly predicted edge labels (in my case these are one-hot encoded tensors).

minjie · April 17, 2022, 2:03pm

Your code looks correct. What’s the shape of pred and labels?

grendelaglaeca · April 17, 2022, 9:38pm

The shape of pred is (num_edges, 1) (I assume it’s the dot product score of the incident nodes?) and the shape of labels is (num_edges, 49) (because there are 49 one-hot encoded classes).

mufeili · April 18, 2022, 9:00am

Merely taking the dot product of a pair of node representations is enough for link prediction, but not if you have multiple edge classes.
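
As a small illustration of why the accuracy comes out as 0.0 with these shapes: comparing a (num_edges, 1) score tensor against a (num_edges, 49) one-hot tensor broadcasts elementwise, so real-valued scores are tested for exact equality with 0s and 1s. The tensors below are random stand-ins, not the tutorial's actual outputs.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
num_edges, num_classes = 6, 49

# Stand-in for one dot-product score per edge.
pred = torch.randn(num_edges, 1)
# Stand-in for one-hot edge labels.
labels = F.one_hot(torch.randint(0, num_classes, (num_edges,)), num_classes).float()

# Broadcasting turns this into a (num_edges, num_classes) comparison of
# real-valued scores against 0/1 entries, which almost never match exactly.
matches = pred == labels
print(matches.shape)           # torch.Size([6, 49])
print(matches.float().mean())  # the "accuracy" the original snippet would report
```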

minjie · April 18, 2022, 1:10pm

@mufeili is right. Try using the EdgePredictor module with the cat operator and set the output size to the number of classes (49 in your case). Use argmax to get the model prediction. You should also use an integer label array instead of a one-hot encoding.
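
A minimal sketch of that idea, written in plain PyTorch so it runs without DGL. `ConcatEdgeHead` is a hypothetical stand-in for a concatenation-based edge predictor, not the EdgePredictor module itself, and the node embeddings, edge endpoints, and labels are random placeholders.

```python
import torch
import torch.nn as nn

class ConcatEdgeHead(nn.Module):
    """Hypothetical head: concatenate endpoint embeddings, map to class logits."""
    def __init__(self, in_feats, num_classes):
        super().__init__()
        self.linear = nn.Linear(2 * in_feats, num_classes)

    def forward(self, h_src, h_dst):
        return self.linear(torch.cat([h_src, h_dst], dim=1))

def accuracy(logits, labels, mask):
    """Fraction of masked edges whose argmax class equals the integer label."""
    pred = logits.argmax(dim=1)
    return (pred[mask] == labels[mask]).float().mean().item()

torch.manual_seed(0)
num_nodes, num_edges, in_feats, num_classes = 10, 30, 16, 49
h = torch.randn(num_nodes, in_feats)                   # placeholder node embeddings
src = torch.randint(0, num_nodes, (num_edges,))        # placeholder edge endpoints
dst = torch.randint(0, num_nodes, (num_edges,))
labels = torch.randint(0, num_classes, (num_edges,))   # integer class per edge
train_mask = torch.zeros(num_edges, dtype=torch.bool)
train_mask[:20] = True

head = ConcatEdgeHead(in_feats, num_classes)
logits = head(h[src], h[dst])                          # shape (num_edges, num_classes)
train_acc = accuracy(logits, labels, train_mask)
```

With logits of shape (num_edges, 49) and integer labels of shape (num_edges,), the comparison after argmax is one-to-one per edge, so the mean is a genuine accuracy.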

grendelaglaeca · April 18, 2022, 6:48pm

Thank you both. Just to make sure, by an integer label array do you mean that each edge should be labeled with a single integer (e.g. from LabelEncoder)?

minjie · April 19, 2022, 1:10am

Yes.
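
For illustration, here is one way the conversion to integer labels might look. The tensors and class names are made up; the sorted-name mapping is a hand-rolled sketch that gives each class a single integer, in the same spirit as a label encoder.

```python
import torch

# Case 1: labels already stored as one-hot rows -> take the index of the 1.
one_hot = torch.tensor([[0, 0, 1],
                        [1, 0, 0],
                        [0, 1, 0],
                        [0, 0, 1]])
int_labels = one_hot.argmax(dim=1)   # tensor([2, 0, 1, 2])

# Case 2: labels stored as raw class names (hypothetical names).
names = ["friend", "colleague", "friend", "family"]
classes = sorted(set(names))                        # fixed, reproducible order
class_to_idx = {c: i for i, c in enumerate(classes)}
encoded = torch.tensor([class_to_idx[n] for n in names])   # tensor([2, 0, 2, 1])
```

Integer labels of this form are also what `torch.nn.CrossEntropyLoss` expects as class-index targets.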
