Differences between RGCN implementation and the paper


I’m currently reading the RGCN paper alongside the implementation, and I found a few differences. I want to double-check whether I need to add them myself to get the original RGCN, or whether they are already there and I’m missing something.

There is a W^{(l)}_{0} in the paper which I can’t find in the code.
There is a sigmoid (non-linearity) in the paper which I also can’t find in the code.

Can you please provide more info on these?

Hi @bfatemi,

The RGCN implementation in DGL follows the authors’ code as closely as we could manage. Both points you mentioned are implemented in the model.

The W^{(l)}_{0} term in equation (2) is implemented as the self-loop message. See code here.
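To make the role of W^{(l)}_{0} concrete, here is a minimal framework-free sketch of one RGCN layer in plain Python. It is not the DGL code: the names rgcn_layer, matvec, W_rel and W_0 are made up for illustration, the normalizer is assumed to be c_{i,r} = |N_i^r| (one of the choices the paper mentions), and ReLU is assumed as the layer activation.

```python
def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def rgcn_layer(h, edges, W_rel, W_0, activation=lambda v: max(v, 0.0)):
    """One RGCN update in the spirit of equation (2) (illustrative only).

    h      : dict node -> feature vector (list of floats)
    edges  : list of (src, rel, dst) triples
    W_rel  : dict rel -> weight matrix W_r
    W_0    : self-loop weight matrix (the W^{(l)}_{0} term)
    """
    out_dim = len(W_0)

    # c_{i,r}: number of incoming edges of relation r at node i (assumed choice).
    counts = {}
    for _, rel, dst in edges:
        counts[(dst, rel)] = counts.get((dst, rel), 0) + 1

    # Sum the normalized relation-specific messages from neighbours.
    agg = {i: [0.0] * out_dim for i in h}
    for src, rel, dst in edges:
        msg = matvec(W_rel[rel], h[src])
        c = counts[(dst, rel)]
        for k in range(out_dim):
            agg[dst][k] += msg[k] / c

    # Add the self-loop message W_0 h_i, then apply the activation.
    out = {}
    for i in h:
        loop_msg = matvec(W_0, h[i])
        out[i] = [activation(a + s) for a, s in zip(agg[i], loop_msg)]
    return out
```

A node with no incoming edges still gets a non-trivial update through loop_msg, which is the practical effect of the W^{(l)}_{0} term.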

The sigmoid is also implemented, in the link prediction example. The line here uses PyTorch’s binary_cross_entropy_with_logits function, and PyTorch’s documentation for it says:

This loss combines a Sigmoid layer and the BCELoss in one single class.
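As a quick sanity check of that statement, the sketch below compares "sigmoid followed by binary cross-entropy" with a single fused loss computed directly on the logit. It uses only the standard library, so the function names here are illustrative and not PyTorch’s; the fused formula is a commonly used numerically stable form and is assumed, not copied from PyTorch’s source.

```python
import math


def sigmoid(x):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))


def bce(p, y):
    """Binary cross-entropy on a probability p with target y in {0, 1}."""
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def bce_with_logits(x, y):
    """Fused sigmoid + BCE on a raw score x (stable form, assumed)."""
    return max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))


# The two routes agree up to floating-point error.
for x in (-3.0, -0.5, 0.0, 2.0):
    for y in (0.0, 1.0):
        assert abs(bce(sigmoid(x), y) - bce_with_logits(x, y)) < 1e-9
```

So when the model’s raw scores are passed to a "with logits" loss, the sigmoid from the paper is applied inside the loss and no separate sigmoid layer shows up in the model code.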
