RGCN and shared weights

YohanObadia · August 15, 2019, 4:41pm

Hi,

I am new to this library and am trying to understand the implementation of the RGCN model.
The formula for an RGCN layer is:

h_i^{(l+1)} = \sigma\left(W_0^{(l)} h_i^{(l)} + \sum_{r\in R}\sum_{j\in N_i^r}\frac{1}{c_{i,r}} W_r^{(l)} h_j^{(l)}\right)

Hence, for a given layer, W_r should be the same for all nodes for a given relation type r. However, from what I understand of the implementation in the tutorial, the weight for the first layer has shape (num_rels, num_nodes, out), which seems to mean that each node i gets its own value W_{r_i}. Why was it implemented this way?

Thanks in advance for any clarification

lingfan · August 15, 2019, 6:44pm

The weight for each layer always has shape (num_rels, in_feats, out_feats). In the first layer, however, the input feature is the node id in one-hot encoding, so in_feats equals num_nodes.

So actually, the linear transformation using weight happens on edges (not nodes), and edges of the same type share the same weight W.
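To make this concrete, here is a minimal sketch in plain Python (not DGL code; the toy sizes, the weight values, and the helper names `one_hot`, `vec_mat`, `edge_message` and `aggregate` are all made up for illustration). It shows that multiplying a one-hot node id by W_r just picks one row of W_r, and that every edge of relation r uses the same W_r. The self-loop term W_0 and the activation are left out.

```python
# Toy sizes (hypothetical): 2 relation types, 4 nodes, 3 output features.
num_rels, num_nodes, out_feats = 2, 4, 3

# One weight matrix per relation type: shape (num_rels, in_feats, out_feats).
# In the first layer in_feats == num_nodes because inputs are one-hot node ids.
weight = [
    [[float(100 * r + 10 * i + o) for o in range(out_feats)]
     for i in range(num_nodes)]
    for r in range(num_rels)
]

def one_hot(node_id, size):
    """One-hot encoding of a node id."""
    return [1.0 if k == node_id else 0.0 for k in range(size)]

def vec_mat(vec, mat):
    """Row vector times matrix."""
    return [sum(v * row[c] for v, row in zip(vec, mat))
            for c in range(len(mat[0]))]

def edge_message(src, rel):
    """Message on an edge of type `rel` leaving node `src`: h_src times W_rel."""
    return vec_mat(one_hot(src, num_nodes), weight[rel])

# Edges as (src, dst, rel): two edges of relation 0, one edge of relation 1.
edges = [(0, 1, 0), (2, 1, 0), (2, 3, 1)]
messages = [edge_message(src, rel) for src, dst, rel in edges]

def aggregate(dst_node):
    """Sum incoming messages, normalized per relation by c_{i,r}."""
    total = [0.0] * out_feats
    for rel in range(num_rels):
        incoming = [m for (s, d, r), m in zip(edges, messages)
                    if d == dst_node and r == rel]
        for m in incoming:
            total = [t + x / len(incoming) for t, x in zip(total, m)]
    return total

h1 = aggregate(1)
```

Both relation-0 edges read from the same matrix `weight[0]`; the per-node dimension only selects a row of it, which is why the first layer behaves like a per-relation embedding table rather than a separate weight per node.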

YohanObadia · August 16, 2019, 10:31pm

Thank you! I got a much better understanding of the GCN and R-GCN papers thanks to your feedback and some more digging!

What would happen if different types of relations have a different number of features? Would you define separate RGCN layers and pool between them at some point?

I am trying to implement an RGCN on molecular data for the Kaggle competition Predicting Molecular Properties, as a way to learn more about PyTorch and GCNs at the same time, and there may well be a different number of features for different types of relations.

lingfan · August 24, 2019, 7:11am

Hi @YohanObadia,

I think the feature you want is related to heterogeneous graphs. Please check out our v0.4 road map on GitHub. Basically, nodes and edges can have different types, and each type can have different features. This is not trivial to implement with the v0.3 DGL API. A common workaround for now is to pretend that all nodes / edges have the same type, and to pad the features wherever a node / edge does not have a given feature. This is not computationally efficient, and it will be improved in the next DGL release.
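As a rough illustration of the padding workaround, here is a sketch in plain Python rather than DGL. The relation names `bond` and `coupling`, the feature values, and the helper `pad_to_common_width` are hypothetical. Each feature row is zero-padded to the widest feature size, and a separate list of type ids remembers which relation each row came from.

```python
# Hypothetical per-relation edge features with different widths.
edge_feats = {
    "bond": [[1.0, 0.5, 2.0], [0.0, 1.5, 1.0]],  # 3 features per edge
    "coupling": [[0.7], [0.2], [0.9]],           # 1 feature per edge
}

def pad_to_common_width(feats_by_type):
    """Zero-pad every feature row to the widest row and record its type id."""
    width = max(len(row) for rows in feats_by_type.values() for row in rows)
    padded, type_ids = [], []
    for type_id, rows in enumerate(feats_by_type.values()):
        for row in rows:
            padded.append(row + [0.0] * (width - len(row)))
            type_ids.append(type_id)
    return padded, type_ids

padded, type_ids = pad_to_common_width(edge_feats)
```

The padded rows can then be stored as a single feature matrix on a graph with one edge type, with `type_ids` playing the role of the relation type. The zeros waste memory and compute, which is the inefficiency mentioned above.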