Inductive binary node classification setup

Hello everyone!
I have a question regarding the inductive binary node classification setup.
To be clear, let's consider the following setup:
I have a training dataset that consists of N graphs G_i, i = 1, 2, …, N.

  1. For each graph, the aim is to predict a binary label for its nodes. Is this the same task you presented with GAT on the PPI dataset (that case was multiclass)?
  2. Can we use GraphSAGE the same way you used GAT in that example?
  3. The graphs in the dataset are independent. To generate the node-level embeddings I will batch them; the cross-entropy loss will not be affected (since when we batch we just construct one giant DGLGraph and then compute the loss for all nodes).

Thanks in advance.
  1. For each graph, the aim is to predict a binary label for its nodes. Is this the same task you presented with GAT on the PPI dataset (that case was multiclass)?

Yes, it's the same, except that PPI is a multi-label binary classification task.

  1. Can we use GraphSAGE the same way you used GAT in that example?

Yes, you only need to change the loss function and evaluation metric.
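To make the "only change the loss and metric" point concrete, here is a minimal sketch of a GraphSAGE-style binary node classifier. The SAGE layer is written in plain PyTorch with a dense adjacency matrix so the snippet is self-contained; in practice you would use `dgl.nn.SAGEConv` on a DGLGraph instead. The toy graph, feature sizes, and class names are all made up for illustration.

```python
import torch
import torch.nn as nn

class MeanSAGELayer(nn.Module):
    """Minimal GraphSAGE layer with mean aggregation, sketched in plain
    PyTorch (roughly what dgl.nn.SAGEConv with aggregator_type='mean' does)."""
    def __init__(self, in_feats, out_feats):
        super().__init__()
        self.fc_self = nn.Linear(in_feats, out_feats)
        self.fc_neigh = nn.Linear(in_feats, out_feats)

    def forward(self, adj, h):
        # adj: (N, N) dense adjacency; mean over each node's neighbors.
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        h_neigh = (adj @ h) / deg
        return self.fc_self(h) + self.fc_neigh(h_neigh)

class SAGENodeClassifier(nn.Module):
    """Two SAGE layers ending in a 1-dim head: one logit per node."""
    def __init__(self, in_feats, hidden_feats):
        super().__init__()
        self.layer1 = MeanSAGELayer(in_feats, hidden_feats)
        self.layer2 = MeanSAGELayer(hidden_feats, 1)

    def forward(self, adj, feats):
        h = torch.relu(self.layer1(adj, feats))
        return self.layer2(adj, h).squeeze(-1)

# Toy 4-node ring graph with 8-dim features and binary node labels.
adj = torch.tensor([[0., 1., 0., 1.],
                    [1., 0., 1., 0.],
                    [0., 1., 0., 1.],
                    [1., 0., 1., 0.]])
feats = torch.randn(4, 8)
labels = torch.tensor([0., 1., 1., 0.])

model = SAGENodeClassifier(in_feats=8, hidden_feats=16)
logits = model(adj, feats)                      # shape (4,): one logit per node
loss = nn.BCEWithLogitsLoss()(logits, labels)   # binary loss on raw logits
```

The GAT example's model and multi-label head would be swapped for this single-logit head, with the training loop otherwise unchanged.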

  1. The graphs in the dataset are independent. To generate the node-level embeddings I will batch them; the cross-entropy loss will not be affected (since when we batch we just construct one giant DGLGraph and then compute the loss for all nodes).

Right.
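This can be checked numerically: with the default `reduction='mean'`, computing BCE over the concatenated node logits of all graphs (which is what the batched giant graph gives you) equals the per-node average over all graphs together. A toy check in plain PyTorch, with made-up logits and labels for two small graphs:

```python
import torch
from torch.nn.functional import binary_cross_entropy_with_logits as bce

# Per-node logits/labels for two independent graphs (3 and 5 nodes).
logits_g1, labels_g1 = torch.randn(3), torch.tensor([0., 1., 0.])
logits_g2, labels_g2 = torch.randn(5), torch.tensor([1., 1., 0., 0., 1.])

# Batched: one giant graph -> one concatenated node set.
batched_loss = bce(torch.cat([logits_g1, logits_g2]),
                   torch.cat([labels_g1, labels_g2]))

# Same quantity computed per graph, then averaged per node (3 + 5 = 8 nodes).
per_node_sum = (bce(logits_g1, labels_g1, reduction='sum')
                + bce(logits_g2, labels_g2, reduction='sum'))
manual_loss = per_node_sum / 8
assert torch.allclose(batched_loss, manual_loss)
```

One subtlety: this weights every node equally, so larger graphs contribute more to the loss than smaller ones; averaging the two per-graph means instead would weight graphs equally.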


Thanks @mufeili for your reply.
Regarding question 2, you mean I just have to use a BCELoss, is that it?

Yes, or BCEWithLogitsLoss as in the GAT example.
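The two losses compute the same quantity; the difference is that `BCEWithLogitsLoss` applies the sigmoid internally (in a numerically stabler way), so the model should output raw logits. A quick illustration with made-up values:

```python
import torch

logits = torch.tensor([1.5, -0.3, 0.8])   # raw model outputs (made up)
labels = torch.tensor([1., 0., 1.])

# Option 1: BCEWithLogitsLoss takes raw logits (preferred, more stable).
loss_a = torch.nn.BCEWithLogitsLoss()(logits, labels)

# Option 2: BCELoss expects probabilities, so apply the sigmoid yourself.
loss_b = torch.nn.BCELoss()(torch.sigmoid(logits), labels)

assert torch.allclose(loss_a, loss_b)  # same value, different stability
```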


Hi @mufeili,
I just implemented my approach according to the setup I described before, but the loss is not decreasing.
I checked the gradients and found that they are on the order of 10**-18, which means they are close to zero.
Any suggestions, please?

How many data points do you have for the positive and the negative class? If there is a class imbalance issue, you may want to perform some upsampling/downsampling in data loading or reweight the data points in the loss computation.
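For the sampling route, one generic mechanism in PyTorch is `WeightedRandomSampler`, which oversamples the minority class at data-loading time. The sketch below uses a made-up flat dataset for brevity; in the graph setting, the items drawn by the sampler would be whole graphs (weighted, say, by their fraction of positive nodes) rather than individual nodes.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

torch.manual_seed(0)

# Made-up imbalanced toy dataset: 90 negative items, 10 positive items.
labels = torch.cat([torch.zeros(90), torch.ones(10)])
data = torch.randn(100, 4)
dataset = TensorDataset(data, labels)

# Weight each item inversely to its class frequency, so the sampler
# draws positives and negatives roughly equally often.
class_counts = torch.tensor([90., 10.])
item_weights = 1.0 / class_counts[labels.long()]
sampler = WeightedRandomSampler(item_weights, num_samples=100, replacement=True)
loader = DataLoader(dataset, batch_size=20, sampler=sampler)

# Fraction of positives per epoch: roughly 0.5 instead of the raw 0.1.
batch_pos_fraction = torch.stack([y.mean() for _, y in loader]).mean()
```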


Exactly, the dataset is imbalanced. Do you think focal loss can solve the problem?

I would start with re-weighting the loss for each data point or up/down-sampling. Focal loss might also be worth trying.
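If focal loss ends up on the table, here is one possible sketch of its binary form on raw logits (following the Lin et al. RetinaNet formulation); the gamma/alpha values are the commonly cited defaults, not something prescribed by this thread:

```python
import torch

def binary_focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    """Binary focal loss on raw logits: per-node BCE scaled by
    (1 - p_t)^gamma, so easy, confidently classified nodes contribute
    less and training focuses on the hard (often minority-class) ones."""
    bce = torch.nn.functional.binary_cross_entropy_with_logits(
        logits, targets, reduction='none')
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)          # prob. of true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * bce).mean()

logits = torch.tensor([2.0, -1.0, 0.5])   # made-up node logits
targets = torch.tensor([1., 0., 1.])
loss = binary_focal_loss(logits, targets)
```

With `gamma=0` and `alpha=0.5` this reduces to half the plain BCE loss, which is a useful sanity check when implementing it.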


I defined the loss function as follows:
loss_fcn = torch.nn.BCEWithLogitsLoss(pos_weight=torch.tensor(len(neg) / len(pos)))
where len(neg) is the number of nodes with label 0 and len(pos) is the number of nodes with label 1. But the loss decreases only from the first epoch to the second and then remains unchanged.