Training a Heterogeneous Graph with Different Node Feature Lengths

Hey,
I am very new to handling heterographs. I have created a heterograph with node types A, B, C, and D.
The feature length of A is 416, B is 21, C is 811, and D is 162. I have binary labels for node type A. A is related to B, C, and D, and each node of type A has a self-loop. I want to do supervised node classification where I correctly predict the labels of A.
All the examples I have seen over here seem to consider the number of features for each node type to be equal (https://docs.dgl.ai/en/0.6.x/guide/training-node.html) and (https://docs.dgl.ai/guide/training.html).
How would I replicate what is being done in these examples for different node feature lengths? I would be really grateful for any help with this.


Not sure if this is possible with what’s implemented in DGL. In Section 2.1 of the RGCN paper,

h_{i}^{(l+1)} = \sigma(\sum_{r \in \mathcal{R}} \sum_{j \in \mathcal{N}_i^r} \frac{1}{c_{i,r}} W_r^{(l)}h_j^{(l)} + W_0^{(l)}h_i^{(l)})

This assumes that h_j^{(l)} and h_i^{(l)} have the same dimension. You could introduce a new per-relation parameter W'_r \in \mathbb{R}^{b \times a_r}, where h_j^{(l)} \in \mathbb{R}^{a_r} and h_i^{(l)} \in \mathbb{R}^{b}, that first projects source features into the destination dimension, and compute the hidden representation with

h_{i}^{(l+1)} = \sigma(\sum_{r \in \mathcal{R}} \sum_{j \in \mathcal{N}_i^r} \frac{1}{c_{i,r}} W_r^{(l)} W'_r h_j^{(l)} + W_0^{(l)}h_i^{(l)})
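This idea can be sketched in plain PyTorch (no DGL), with made-up sizes: two relations whose source feature dims are 21 and 811, destination features of dim 416, and a hidden size of 16. A mean over neighbors stands in for the 1/c_{i,r} normalization, and the neighbor tensors are assumed to be pre-gathered.

```python
import torch
import torch.nn as nn

class PerRelationRGCNLayer(nn.Module):
    """Per-relation input projection (W'_r) followed by the usual RGCN sum."""
    def __init__(self, src_dims, dst_dim, out_dim):
        super().__init__()
        # W'_r: project each relation's source features to the destination dim
        self.proj = nn.ModuleList([nn.Linear(d, dst_dim, bias=False) for d in src_dims])
        # W_r: one weight per relation, shared output space
        self.rel = nn.ModuleList([nn.Linear(dst_dim, out_dim, bias=False) for _ in src_dims])
        # W_0: self-loop weight
        self.self_loop = nn.Linear(dst_dim, out_dim, bias=False)

    def forward(self, h_dst, neigh_feats):
        # neigh_feats[r]: (num_dst, num_neighbors_r, src_dims[r]);
        # mean over neighbors plays the role of 1/c_{i,r}
        out = self.self_loop(h_dst)
        for r, feats in enumerate(neigh_feats):
            out = out + self.rel[r](self.proj[r](feats).mean(dim=1))
        return torch.relu(out)

layer = PerRelationRGCNLayer([21, 811], 416, 16)
h_a = torch.randn(10, 416)                             # 10 destination nodes
neigh = [torch.randn(10, 3, 21), torch.randn(10, 2, 811)]
print(layer(h_a, neigh).shape)  # torch.Size([10, 16])
```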


You can use a linear layer to project the features of each node type so that they have the same length at the beginning. You may find TypedLinear helpful.
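A minimal torch-only sketch of that suggestion, with one nn.Linear per node type (the input dims are taken from the question; the hidden size of 64 and batch size of 5 are assumptions):

```python
import torch
import torch.nn as nn

# One projection per node type so all features end up with the same length
in_dims = {'A': 416, 'B': 21, 'C': 811, 'D': 162}
hidden = 64
proj = nn.ModuleDict({ntype: nn.Linear(d, hidden) for ntype, d in in_dims.items()})

feats = {ntype: torch.randn(5, d) for ntype, d in in_dims.items()}
h = {ntype: proj[ntype](x) for ntype, x in feats.items()}
print({k: v.shape for k, v in h.items()})  # every type is now (5, 64)
```

After this projection the per-type features are interchangeable, so any of the homogeneous-dimension examples in the guide apply as written.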


Hey, so I am trying to use TypedLinear on my features. My feature data is a 4055 × 416 tensor, and my code was as follows:

m = TypedLinear(416, 20, 5)
x_type = torch.randint(0, 5, (4055,))
y = m(G.nodes['node'].data['feature'], x_type)

I got the following error:

File "C:\Users\akash\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)

File "C:\Users\akash\anaconda3\lib\site-packages\dgl\nn\pytorch\linear.py", line 174, in forward
return gather_mm(x, w, idx_b=x_type)

File "C:\Users\akash\anaconda3\lib\site-packages\dgl\ops\gather_mm.py", line 40, in gather_mm
return torch.index_select(F.segment_mm(sorted_a, b, seglen), 0, rev_perm)

File "C:\Users\akash\anaconda3\lib\site-packages\dgl\backend\pytorch\sparse.py", line 831, in segment_mm
C.append(A[off:off+seglen_A[i]] @ B[i])

RuntimeError: expected scalar type Double but found Float

So, I thought of rewriting the last line as:

y = m(G.nodes['node'].data['feature'].to(torch.double), x_type)

But I still got the same error. Do you have any idea what I am doing wrong?

Yeah this was what I was thinking as well. But I encountered problems with this approach as well. :confused:

Sorry for the confusing error. Could you tell me the shape and data type of x_type and G.nodes['node'].data['feature']?

The shape of x_type is torch.Size([4055]) and the data type is torch.int64. The data type of G.nodes['node'].data['feature'] is torch.float64 and the shape is torch.Size([4055, 416]).

Could you try changing the dtype of G.nodes['node'].data['feature'] to float32?
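For reference, the mismatch can be reproduced in plain torch: module parameters default to float32, so multiplying a float64 input against them raises exactly this error, and the fix is to cast the input down to float32 rather than up to double (the 4 × 416 sizes here are illustrative):

```python
import torch

w = torch.randn(416, 20)                     # float32, like TypedLinear's weights
x64 = torch.randn(4, 416, dtype=torch.float64)
# x64 @ w would raise: RuntimeError: expected scalar type Double but found Float
x32 = x64.float()                            # cast the *input* down to float32
y = x32 @ w
print(y.dtype)  # torch.float32
```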

Yes, I did that. Thank you!
However, instead of using TypedLinear explicitly, I decided to pass the features through a torch.nn.Linear layer. That seems to work.
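A minimal sketch of this working approach (the 4055 × 416 size and binary labels are from the thread; the hidden size of 64 and the plain classifier head are assumptions standing in for a real hetero GNN layer):

```python
import torch
import torch.nn as nn

# Project type-A features with an ordinary nn.Linear first, then feed the
# now-uniform features into whatever model follows (here, a simple head
# producing 2 logits for the binary labels on node type A).
proj_a = nn.Linear(416, 64)
head = nn.Linear(64, 2)

feat_a = torch.randn(4055, 416)              # float32, matching the dtype fix above
logits = head(torch.relu(proj_a(feat_a)))
print(logits.shape)  # torch.Size([4055, 2])
```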


This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.