Input feature dimensions in HeteroGraphConv

Hi,
I am building a recommender system using GNNs and DGL. My graph is heterogeneous: I have 3 types of nodes ('user', 'item', 'sport') and 6 types of relations (user - buys - item, item - bought-by - user, user - practices - sport, etc.).

My main model consists of multiple layers of HeteroGraphConv. In each HeteroGraphConv, I specify a custom ConvLayer for each relation type.

This ConvLayer takes as input a tuple of input feature dimensions (dimension of the node's own features, dimension of the neighbour nodes' features) and an output feature dimension (the dimension of the node embedding).

import dgl.function as fn
import dgl.nn.pytorch as dglnn
import torch.nn as nn

class ConvLayer(nn.Module):
    def __init__(self, in_feats, out_feats):
        super(ConvLayer, self).__init__()
        self._in_self_feats, self._in_neigh_feats = in_feats
        self._out_feats = out_feats
        self.fc_self = nn.Linear(self._in_self_feats, out_feats)
        self.fc_neigh = nn.Linear(self._in_neigh_feats, out_feats)

    def forward(self, graph, x):
        print(self._in_self_feats, self._in_neigh_feats)
        h_neigh, h_self = x
        graph.srcdata['h'] = h_neigh

        # mean-aggregate neighbour features onto the destination nodes
        graph.update_all(
            fn.copy_src('h', 'm'),
            fn.mean('m', 'neigh'))
        h_neigh = graph.dstdata['neigh']

        z = self.fc_self(h_self) + self.fc_neigh(h_neigh)
        return z

layer = dglnn.HeteroGraphConv({'buys': ConvLayer((user_dim, item_dim), hidden_dim),           # considered 5th
                               'bought-by': ConvLayer((item_dim, user_dim), hidden_dim),      # considered 1st
                               'utilized-for': ConvLayer((item_dim, sport_dim), hidden_dim),  # considered 2nd
                               'utilizes': ConvLayer((sport_dim, item_dim), hidden_dim),      # considered 4th
                               'practices': ConvLayer((user_dim, sport_dim), hidden_dim),     # considered 6th
                               'practiced-by': ConvLayer((sport_dim, user_dim), hidden_dim)}, # considered 3rd
                              aggregate='sum')

However, when I run my training and print self._in_self_feats and self._in_neigh_feats, I do not get the right input dimensions.

E.g., 'utilized-for' seems to be the second relation type considered by HeteroGraphConv, so its input dimensions should be (item dimension, sport dimension). But when I print them, I get (sport dimension, user dimension).

Thanks in advance!

import dgl
import dgl.function as fn
import dgl.nn.pytorch as dglnn
import torch
import torch.nn as nn

user_dim = 2
item_dim = 3
hidden_dim = 4
sport_dim = 5

g = dgl.heterograph(
    {
        ('user', 'buys', 'item'): (torch.tensor([0, 1]), torch.tensor([1, 2])),
        ('item', 'bought-by', 'user'): (torch.tensor([1, 3]), torch.tensor([1, 4])),
        ('user', 'practices', 'sport'): (torch.tensor([1, 1]), torch.tensor([1, 2])),
        ('sport', 'practiced-by', 'user'): (torch.tensor([1, 3, 4]), torch.tensor([2, 6, 7])),
        ('sport', 'utilizes', 'item'): (torch.tensor([1, 2, 3]), torch.tensor([3, 4, 5])),
        ('item', 'utilized-for', 'sport'): (torch.tensor([1]), torch.tensor([2]))
    }
)

class ConvLayer(nn.Module):
    def __init__(self, in_feats, out_feats):
        super(ConvLayer, self).__init__()
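        # in_feats = (source/neighbour feature size, destination/self feature size)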
        self._in_neigh_feats, self._in_self_feats = in_feats
        self._out_feats = out_feats
        self.fc_self = nn.Linear(self._in_self_feats, out_feats)
        self.fc_neigh = nn.Linear(self._in_neigh_feats, out_feats)
    
    def forward(self, graph, x):
        print(self._in_self_feats, self._in_neigh_feats)
        h_neigh, h_self = x
        graph.srcdata['h'] = h_neigh   
    
        graph.update_all( 
            fn.copy_src('h', 'm'),
            fn.mean('m', 'neigh')) 
        h_neigh = graph.dstdata['neigh'] 
        
        z = self.fc_self(h_self) + self.fc_neigh(h_neigh)
        return z

layer = dglnn.HeteroGraphConv({'buys': ConvLayer((user_dim, item_dim), hidden_dim),           # considered 5th
                               'bought-by': ConvLayer((item_dim, user_dim), hidden_dim),      # considered 1st
                               'utilized-for': ConvLayer((item_dim, sport_dim), hidden_dim),  # considered 2nd
                               'utilizes': ConvLayer((sport_dim, item_dim), hidden_dim),      # considered 4th
                               'practices': ConvLayer((user_dim, sport_dim), hidden_dim),     # considered 6th
                               'practiced-by': ConvLayer((sport_dim, user_dim), hidden_dim)}, # considered 3rd
                              aggregate='sum')
feats = {'user': torch.randn(g.num_nodes('user'), user_dim),
         'item': torch.randn(g.num_nodes('item'), item_dim),
         'sport': torch.randn(g.num_nodes('sport'), sport_dim)}
layer(g, (feats, feats))
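# The output is a dict keyed by destination node type:
# {'user': ..., 'item': ..., 'sport': ...}, each of shape (num_nodes, hidden_dim).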

The code above should work. There are two issues going on here:

  1. HeteroGraphConv expects either that all submodules take a single tensor in forward, or that all submodules take a pair of tensors, as ConvLayer does. In the latter case, HeteroGraphConv itself expects a pair of dict[str, Tensor] in its forward function. That is, you need to replace layer(g, feats) with layer(g, (feats, feats)). Unfortunately this is not mentioned anywhere in the doc, which should be fixed.
  2. For each relation, HeteroGraphConv passes the pair (source features, destination features) to the corresponding submodule, so the first element of in_feats must be the size of the source (neighbour) features and the second the size of the destination (self) features. Your self._in_self_feats, self._in_neigh_feats = in_feats has them swapped; the code above fixes this by unpacking self._in_neigh_feats, self._in_self_feats = in_feats instead (reversing the order of the feature sizes when instantiating ConvLayer would work equally well).
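To make point 2 concrete, here is roughly what HeteroGraphConv does for a single relation (a simplified sketch; the real implementation also skips relations with no edges and aggregates the per-relation outputs per destination type):

# Simplified sketch of HeteroGraphConv's per-relation dispatch,
# using the 'buys' relation ('user', 'buys', 'item') as an example:
x_src, x_dst = feats, feats                 # the pair passed via layer(g, (feats, feats))
rel_graph = g['user', 'buys', 'item']       # bipartite subgraph for this relation
out_item = layer.mods['buys'](rel_graph, (x_src['user'], x_dst['item']))
# ConvLayer thus receives (user features, item features), so its in_feats
# must be (user_dim, item_dim) = (source size, destination size).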

Thanks @mufeili for the clear answer!

Hi @mufeili, I am a bit confused by this part: why do we need to send a pair (feats, feats)?

Also, in the HeteroGraphConv doc, it is written:

Call forward with a pair of inputs is allowed and each submodule will also be invoked with a pair of inputs.

x_src = {'user': ..., 'store': ...}
x_dst = {'user': ..., 'game': ...}
y_dst = conv(g, (x_src, x_dst))
print(y_dst.keys())
dict_keys(['user', 'game'])

Why do we need x_src and x_dst?

You don't necessarily need to send a pair (feats, feats). The reason for using (feats, feats) rather than feats is that there are cases where we want to differentiate the nodes that are the source of some edges from the nodes that are the destination of some edges. This is particularly helpful in sampling-based training; see also user guide 6.1. In such cases, the underlying NN modules require passing the features of source nodes and destination nodes separately.
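For example, with neighbour sampling the source and destination node sets of each block differ, so the two feature dicts genuinely diverge. A rough sketch against the graph above, assuming the dgl.dataloading API of recent DGL versions (sampler and dataloader are illustrative names):

# Rough sketch: sample the 1-hop neighbourhoods of two seed 'item' nodes.
# g, feats and layer are the objects defined earlier in this thread.
sampler = dgl.dataloading.MultiLayerFullNeighborSampler(1)
dataloader = dgl.dataloading.NodeDataLoader(
    g, {'item': torch.tensor([3, 4])}, sampler, batch_size=2)

for input_nodes, output_nodes, blocks in dataloader:
    block = blocks[0]
    # Source nodes: the seeds plus all their sampled neighbours (of every type);
    # destination nodes: only the seed items. Since the two sets differ, the
    # features must be passed as separate source/destination dicts.
    x_src = {ntype: feats[ntype][block.srcnodes[ntype].data[dgl.NID]]
             for ntype in block.srctypes}
    x_dst = {ntype: feats[ntype][block.dstnodes[ntype].data[dgl.NID]]
             for ntype in block.dsttypes}
    y_dst = layer(block, (x_src, x_dst))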