Convolutions with mini-batches of a heterogeneous graph

Hi there!

I am using the RGCN implementation for heterogeneous graphs and I have implemented mini-batching. The problem right now is that in every convolution step the feature tensors of all nodes of every node type are forwarded, instead of only those of the batched nodes. This is why it throws the following error:

File "/Users/sophiakrix/Envs/deeplink/lib/python3.7/site-packages/dgl/heterograph.py", line 3752, in _set_n_repr
    ' Got %d and %d instead.' % (nfeats, num_nodes))
dgl._ffi.base.DGLError: Expect number of features to match number of nodes (len(u)). Got 372 and 26 instead. 

In this case 372 is the total number of nodes of node type A and 26 is the number of nodes of node type A in the mini-batch.
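
To illustrate, the same check can be triggered on a toy graph (a minimal made-up example, not my actual data):

import dgl
import torch

# a toy heterograph with 3 nodes of type 'A'
g = dgl.heterograph({('A', 'rel', 'A'): (torch.tensor([0, 1]), torch.tensor([1, 2]))})
g.nodes['A'].data['h'] = torch.randn(3, 4)    # OK: 3 rows for 3 nodes
# g.nodes['A'].data['h'] = torch.randn(5, 4)  # DGLError: 5 rows vs 3 nodes, like 372 vs 26 above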

Can you tell me how I can forward only the feature information from the mini-batch and not from the entire graph? That would help a lot!


Just to give a code snippet, this is how the forwarding is happening:

# g is the entire graph with all the nodes of different node types
self.embed_layer = RelGraphEmbed(g=g, embed_size=self.h_dim)

def forward(self, batched_graph):
    batched_graph = batched_graph.local_var()
    hs = self.embed_layer(batched_graph)
    for idx, layer in enumerate(self.layers):
        hs = layer.forward(batched_graph, inputs=hs, layer_number=idx)
    return hs

How did you implement the mini-batching? Have you checked our user guide for mini-batch training on heterogeneous graphs?
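
In short, the user guide looks up features for only the sampled input nodes of each block instead of the whole graph. A minimal sketch, assuming node features are stored on the full graph g under the name 'feat':

import dgl

# IDs of the block's input (source) nodes in the original graph
input_nodes = {ntype: block.srcnodes[ntype].data[dgl.NID] for ntype in block.srctypes}
# slice the full-graph features down to just those nodes
h = {ntype: g.nodes[ntype].data['feat'][nids] for ntype, nids in input_nodes.items()}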


@mufeili Yes, I’ve seen the Minibatch Tutorial here, thanks for the pointer. Could you explain why we need input and output nodes here and how we have to specify them in the code?
This is not obvious to me right now.

@mufeili If I am using a heterogeneous graph with multiple node types and multiple edge types, can the mini-batch training still work?

Because the Mini-Batch Tutorial states the following requirements:

All message passing modules in DGL work on homogeneous graphs, unidirectional bipartite graphs (that have two node types and one edge type), and a block with one edge type. Essentially, the input graph and feature of a builtin DGL neural network module must satisfy one of the following cases.

  • If the input feature is a pair of tensors, then the input graph must be unidirectional bipartite.
  • If the input feature is a single tensor and the input graph is a block, DGL will automatically set the feature on the output nodes as the first few rows of the input node features.
  • If the input feature is a single tensor and the input graph is not a block, then the input graph must be homogeneous.
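
For instance, the first case looks like this (a toy sketch; the graph and feature sizes are made up):

import dgl
import dgl.nn as dglnn
import torch

# unidirectional bipartite graph: 4 'user' nodes -> 3 'item' nodes
g = dgl.heterograph({('user', 'clicks', 'item'): ([0, 1, 2, 3], [0, 1, 2, 0])})
conv = dglnn.GraphConv(5, 8, norm='right')
h_user = torch.randn(4, 5)       # source-node features
h_item = torch.randn(3, 5)       # destination-node features
out = conv(g, (h_user, h_item))  # pair of tensors on a bipartite graph
print(out.shape)                 # torch.Size([3, 8]): one row per destination node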

If I understand this correctly, the only option for me right now is to have one edge type in the block. So how can I batch the heterogeneous graph so that the batched subgraph only has one edge type?

@mufeili I am now trying to integrate the CustomHeteroGraphConv into the RelGraphConvLayer.
So I replace this line, where the dglnn.HeteroGraphConv is initialised:

self.conv = dglnn.HeteroGraphConv({
                rel : dglnn.GraphConv(in_feat, out_feat, norm='right', weight=False, bias=False)
                for rel in rel_names
            })

with the CustomHeteroGraphConv, which requires the graph g as input:

self.conv = dglnn.HeteroGraphConv({
                rel : CustomHeteroGraphConv(g=self.g, in_feats=self.in_feat, out_feats=self.out_feat)
                for rel in rel_names
            })

Should this g be the entire graph or just the batched graph?

  1. Why do you need to pass g in instantiating the conv module?
  2. If you are working with multiple small graphs (e.g. graph classification), then the input to the conv module should be a batched graph for each iteration. If you are working with a large graph, then the input to the conv module should be a block.
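
A minimal sketch of the two cases (g1, g2, g and train_nids are hypothetical placeholders):

import dgl

# graph classification: merge many small graphs into one batched graph
bg = dgl.batch([g1, g2])

# one large graph: sample a list of blocks, one block per GNN layer
sampler = dgl.dataloading.MultiLayerNeighborSampler([4, 4])
loader = dgl.dataloading.NodeDataLoader(g, train_nids, sampler, batch_size=64)
input_nodes, output_nodes, blocks = next(iter(loader))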

@mufeili

  1. As I understand from this code, which is from the tutorial about mini-batching with heterogeneous graphs, CustomHeteroGraphConv has an input parameter g.
import torch.nn as nn
import dgl.function as fn


class CustomHeteroGraphConv(nn.Module):
    def __init__(self, g, in_feats, out_feats):
        super().__init__()
        self.Ws = nn.ModuleDict()
        self.Vs = nn.ModuleDict()  # was missing: Vs must be created before indexing into it
        for etype in g.canonical_etypes:
            utype, rel, vtype = etype
            # nn.ModuleDict keys must be strings, so index by the relation
            # name (this assumes relation names are unique in the graph)
            self.Ws[rel] = nn.Linear(in_feats[utype], out_feats[vtype])
        for ntype in g.ntypes:
            self.Vs[ntype] = nn.Linear(in_feats[ntype], out_feats[ntype])

    def forward(self, g, h):
        with g.local_scope():
            for ntype in g.ntypes:
                h_src, h_dst = h[ntype]  # h holds a (src, dst) feature pair per node type
                g.dstnodes[ntype].data['h_dst'] = self.Vs[ntype](h_dst)
                g.srcnodes[ntype].data['h_src'] = h_src
            for etype in g.canonical_etypes:
                utype, rel, vtype = etype
                g.update_all(
                    fn.copy_u('h_src', 'm'), fn.mean('m', 'h_neigh'),
                    etype=etype)
                g.dstnodes[vtype].data['h_dst'] = \
                    g.dstnodes[vtype].data['h_dst'] + \
                    self.Ws[rel](g.dstnodes[vtype].data['h_neigh'])
            return {ntype: g.dstnodes[ntype].data['h_dst']
                    for ntype in g.ntypes}

Question: Should the g in the __init__ function be the entire graph, and the g in the forward() function be the batched graph?

  1. I am working with one entire heterograph on which I want to do link prediction.
  1. I see. You can pass the entire graph to __init__.
  2. For mini-batch training on an entire graph, you need to pass a block, which is different from a batched graph. You can generate blocks with an EdgeDataLoader instance as in user guide 6.3.

@mufeili Considering I have an implementation following user guide 6.3 as follows:

    # edge IDs used to compute the output
    train_eid_dict = {
        # num_edges also accepts the full canonical etype, which is unambiguous
        canonical_etype: torch.arange(training_graph.num_edges(canonical_etype))
        for canonical_etype in training_graph.canonical_etypes
    }
    val_eid_dict = {
        # was training_graph.num_edges(...): a copy-paste slip
        canonical_etype: torch.arange(validation_graph.num_edges(canonical_etype))
        for canonical_etype in validation_graph.canonical_etypes
    }

    # train sampler
    sampler = dgl.dataloading.MultiLayerNeighborSampler([fanout] * n_layers)
    # pick one negative edge per positive edge
    neg_sampler = dgl.dataloading.negative_sampler.Uniform(1)
    train_loader = dgl.dataloading.EdgeDataLoader(
        g=heterograph.g,
        eids=train_eid_dict,  # the edge set in graph g to compute outputs, Tensor or dict[etype, Tensor]
        block_sampler=sampler,
        batch_size=batch_size,
        g_sampling=training_graph,  # only sample from the training graph
        negative_sampler=neg_sampler,
        shuffle=True,
    )

    # validation sampler
    # we do not sample full neighborhoods, to save computation
    val_sampler = dgl.dataloading.MultiLayerNeighborSampler([fanout] * n_layers)
    val_loader = dgl.dataloading.EdgeDataLoader(
        g=heterograph.g,
        eids=val_eid_dict,  # the edge set in graph g to compute outputs, Tensor or dict[etype, Tensor]
        block_sampler=val_sampler,  # was sampler: use the validation sampler here
        batch_size=batch_size,
        g_sampling=validation_graph,  # only sample from the validation graph
        negative_sampler=neg_sampler,
        shuffle=True,
    )

If I now want to implement the mini-batching as in the tutorial, I have to change the inputs to the model, since the model I am using is a LinkPredict model containing the RelGraphConvLayer.

    for input_nodes, positive_graph, negative_graph, blocks in train_loader:
        blocks = [b.to(torch.device('cuda')) for b in blocks]
        positive_graph = positive_graph.to(torch.device('cuda'))
        negative_graph = negative_graph.to(torch.device('cuda'))
        input_features = blocks[0].srcdata['features']
        edge_labels = edge_subgraph.edata['labels']
        edge_predictions = model(edge_subgraph, blocks, input_features)
        loss = compute_loss(edge_labels, edge_predictions)
        opt.zero_grad()
        loss.backward()
        opt.step()

I have changed the RelGraphConv at one line to use the CustomHeteroGraphConv as mentioned above:

class RelGraphConvLayer(nn.Module):
    r"""Relational graph convolution layer.
    Parameters
    ----------
    in_feat : int
        Input feature size.
    out_feat : int
        Output feature size.
    rel_names : list[str]
        Relation names.
    device : torch.device
        Device the module runs on.
    g : DGLHeteroGraph
        The entire graph, used to instantiate CustomHeteroGraphConv.
    weight : bool, optional
        True if a linear layer is applied after message passing. Default: True
    bias : bool, optional
        True if bias is added. Default: True
    activation : callable, optional
        Activation function. Default: None
    self_loop : bool, optional
        True to include self loop message. Default: True
    dropout : float, optional
        Dropout rate. Default: 0.0
    """
    def __init__(self,
                 in_feat,
                 out_feat,
                 rel_names,
                 device,
                 g,
                 *,
                 weight=True,
                 bias=True,
                 activation=None,
                 self_loop=True,
                 dropout=0.0):
        super(RelGraphConvLayer, self).__init__()
        self.in_feat = in_feat
        self.out_feat = out_feat
        self.rel_names = rel_names
        self.bias = bias
        self.activation = activation
        self.self_loop = self_loop
        self.device = device
        # added self.g
        self.g = g

        # changed from regular dglnn.HeteroGraphConv to CustomHeteroGraphConv
        self.conv = dglnn.HeteroGraphConv({
                rel : CustomHeteroGraphConv(g=self.g, in_feats=self.in_feat, out_feats=self.out_feat)
                for rel in rel_names
            })


        self.use_weight = weight
        # basis decomposition (num_bases) was removed from this version;
        # keep the flag explicitly False so that forward() still runs
        self.use_basis = False
        if self.use_weight:
            self.weight = nn.Parameter(th.Tensor(len(self.rel_names), in_feat, out_feat))
            nn.init.xavier_uniform_(self.weight, gain=nn.init.calculate_gain('relu'))

        # bias
        if bias:
            self.h_bias = nn.Parameter(th.Tensor(out_feat))
            nn.init.zeros_(self.h_bias)

        # weight for self loop
        if self.self_loop:
            self.loop_weight = nn.Parameter(th.Tensor(in_feat, out_feat))
            nn.init.xavier_uniform_(self.loop_weight,
                                    gain=nn.init.calculate_gain('relu'))

        self.dropout = nn.Dropout(dropout)

    def forward(self, g, inputs):
        """Forward computation
        Parameters
        ----------
        g : DGLHeteroGraph
            Input graph.
        inputs : dict[str, torch.Tensor]
            Node feature for each node type.
        Returns
        -------
        dict[str, torch.Tensor]
            New node features for each node type.
        """
        g = g.local_var()
        if self.use_weight:
            weight = self.basis() if self.use_basis else self.weight
            wdict = {self.rel_names[i] : {'weight' : w.squeeze(0)}
                     for i, w in enumerate(th.split(weight, 1, dim=0))}
        else:
            wdict = {}

        if g.is_block:
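            # in a block, the output (dst) nodes are guaranteed to come first
            # among the input (src) nodes, so the first rows of the input
            # features are exactly the features of the output nodes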
            inputs_src = inputs
            inputs_dst = {k: v[:g.number_of_dst_nodes(k)] for k, v in inputs.items()}
        else:
            inputs_src = inputs_dst = inputs

        hs = self.conv(g, inputs, mod_kwargs=wdict)

        def _apply(ntype, h):
            if self.self_loop:
                h = h + th.matmul(inputs_dst[ntype], self.loop_weight)
            if self.bias:
                h = h + self.h_bias
            if self.activation:
                h = self.activation(h)
            return self.dropout(h)
        return {ntype : _apply(ntype, h) for ntype, h in hs.items()}

Can you tell me how I need to modify this line here in order to work for the CustomHeteroGraphConv?

edge_predictions = model(edge_subgraph, blocks, input_features)

When I try to run this, I get an error when iterating over the train_loader:

Traceback (most recent call last):
  File "/Applications/PyCharm.app/Contents/helpers/pydev/_pydevd_bundle/pydevd_exec2.py", line 3, in Exec
    exec(exp, global_vars, local_vars)
  File "<input>", line 1, in <module>
  File "/./lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/./lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 385, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/./lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/./lib/python3.7/site-packages/dgl/dataloading/dataloader.py", line 664, in collate
    return self._collate_with_negative_sampling(items)
  File "/./lib/python3.7/site-packages/dgl/dataloading/dataloader.py", line 590, in _collate_with_negative_sampling
    pair_graph = self.g.edge_subgraph(items, preserve_nodes=True)
  File "/./lib/python3.7/site-packages/dgl/utils/internal.py", line 898, in _fn
    return func(*args, **kwargs)
  File "/./lib/python3.7/site-packages/dgl/subgraph.py", line 279, in edge_subgraph
    sgi = graph._graph.edge_subgraph(induced_edges, preserve_nodes)
  File "./lib/python3.7/site-packages/dgl/heterograph_index.py", line 824, in edge_subgraph
    return _CAPI_DGLHeteroEdgeSubgraph(self, eids, preserve_nodes)
  File "dgl/_ffi/_cython/./function.pxi", line 287, in dgl._ffi._cy3.core.FunctionBase.__call__
  File "dgl/_ffi/_cython/./function.pxi", line 222, in dgl._ffi._cy3.core.FuncCall
  File "dgl/_ffi/_cython/./function.pxi", line 211, in dgl._ffi._cy3.core.FuncCall3
  File "dgl/_ffi/_cython/./base.pxi", line 155, in dgl._ffi._cy3.core.CALL
dgl._ffi.base.DGLError: [18:37:07] /tmp/dgl_src/src/array/cpu/array_index_select.cc:22: Check failed: idx_data[i] < arr_len (377591 vs. 5571) : Index out of range.
Stack trace:
  [bt] (0) 1   libdgl.dylib                        0x0000000128a6b3af dmlc::LogMessageFatal::~LogMessageFatal() + 111
  [bt] (1) 2   libdgl.dylib                        0x0000000128a8496c dgl::runtime::NDArray dgl::aten::impl::IndexSelect<(DLDeviceType)1, long long, long long>(dgl::runtime::NDArray, dgl::runtime::NDArray) + 428
  [bt] (2) 3   libdgl.dylib                        0x0000000128a40651 dgl::aten::IndexSelect(dgl::runtime::NDArray, dgl::runtime::NDArray) + 3809
  [bt] (3) 4   libdgl.dylib                        0x0000000129449f99 dgl::UnitGraph::COO::EdgeSubgraph(std::__1::vector<dgl::runtime::NDArray, std::__1::allocator<dgl::runtime::NDArray> > const&, bool) const + 489
  [bt] (4) 5   libdgl.dylib                        0x00000001294382b3 dgl::UnitGraph::EdgeSubgraph(std::__1::vector<dgl::runtime::NDArray, std::__1::allocator<dgl::runtime::NDArray> > const&, bool) const + 163
  [bt] (5) 6   libdgl.dylib                        0x000000012936110b dgl::HeteroGraph::EdgeSubgraph(std::__1::vector<dgl::runtime::NDArray, std::__1::allocator<dgl::runtime::NDArray> > const&, bool) const + 1243
  [bt] (6) 7   libdgl.dylib                        0x00000001293799ae std::__1::__function::__func<dgl::$_39, std::__1::allocator<dgl::$_39>, void (dgl::runtime::DGLArgs, dgl::runtime::DGLRetValue*)>::operator()(dgl::runtime::DGLArgs&&, dgl::runtime::DGLRetValue*&&) + 606
  [bt] (7) 8   libdgl.dylib                        0x0000000129307848 DGLFuncCall + 72
  [bt] (8) 9   core.cpython-37m-darwin.so          0x0000000117884fe1 __pyx_f_3dgl_4_ffi_4_cy3_4core_FuncCall(void*, _object*, DGLValue*, int*) + 513

@mufeili Do you know what’s wrong here?

How did you define model and the associated class?

@mufeili I have basically followed the implementation from the rgcn linkpredict example.
model is an instance of LinkPredict, whose forward() method is the one from BaseRGCN:

class LinkPredict(nn.Module):
    def __init__(self,
                 heterograph: Heterograph,
                 g,
                 h_dim,
                 out_dim,
                 device,
                 num_hidden_layers=1,
                 dropout=0,
                 use_self_loop=True,
                 reg_param=0,
                 ):
        super(LinkPredict, self).__init__()
        ...
        self.rgcn: BaseRGCN = BaseRGCN(
            g=g,
            h_dim=h_dim,
            out_dim=out_dim,
            device=device,
            num_hidden_layers=num_hidden_layers,
            dropout=dropout,
            use_self_loop=use_self_loop
        )
        ...

    def forward(self, g):
        return self.rgcn.forward(g)

BaseRGCN is defined in my case as follows:

class BaseRGCN(nn.Module):
    def __init__(self,
                 g,
                 h_dim,
                 out_dim,
                 device: th.device,
                 num_hidden_layers=1,
                 dropout=0,
                 use_self_loop=True):
        super(BaseRGCN, self).__init__()
        ...
        self.layers: nn.ModuleList = nn.ModuleList()
        self.embed_layer = RelGraphEmbed(g=self.g, embed_size=self.h_dim, device=self.device)

        # Embedding to Input layer
        self.layers.append(RelGraphConvLayer(
            in_feat=self.h_dim, out_feat=self.h_dim, rel_names=self.rel_names, device=device, g=self.g, #layer_number=0, last_layer=False,
            activation=F.relu, self_loop=self.use_self_loop, dropout=self.dropout)
        )

        # hidden to hidden layer
        for i in range(self.num_hidden_layers):
            self.layers.append(RelGraphConvLayer(
                in_feat=self.h_dim, out_feat=self.h_dim, rel_names=self.rel_names, device=device, g=self.g, #layer_number=1, last_layer=False,
                activation=F.relu, self_loop=self.use_self_loop, dropout=self.dropout, )
            )

        # hidden to output layer
        self.layers.append(RelGraphConvLayer(
            in_feat=self.h_dim, out_feat=self.out_dim, rel_names=self.rel_names, device=device, g=self.g, #layer_number=2, last_layer=True,
            activation=None,self_loop=self.use_self_loop,)
        )

    def forward(self, batched_graph):
        batched_graph = batched_graph.local_var()
        hs = self.embed_layer(batched_graph)
        for idx, layer in enumerate(self.layers):
            hs = layer.forward(batched_graph, inputs=hs, layer_number=idx)
        return hs

This in turn leads again to the forward() function of each layer, i.e. of the RelGraphConvLayer:

class RelGraphConvLayer(nn.Module):
    r"""Relational graph convolution layer.
    Parameters
    ----------
    in_feat : int
        Input feature size.
    out_feat : int
        Output feature size.
    rel_names : list[str]
        Relation names.
    device : torch.device
        Device the module runs on.
    g : DGLHeteroGraph
        The entire graph, used to instantiate CustomHeteroGraphConv.
    weight : bool, optional
        True if a linear layer is applied after message passing. Default: True
    bias : bool, optional
        True if bias is added. Default: True
    activation : callable, optional
        Activation function. Default: None
    self_loop : bool, optional
        True to include self loop message. Default: True
    dropout : float, optional
        Dropout rate. Default: 0.0
    """
    def __init__(self,
                 in_feat,
                 out_feat,
                 rel_names,
                 device,
                 g,
                 *,
                 weight=True,
                 bias=True,
                 activation=None,
                 self_loop=True,
                 dropout=0.0):
        super(RelGraphConvLayer, self).__init__()
        # added self.g
        self.g = g

        # changed from regular dglnn.HeteroGraphConv to CustomHeteroGraphConv
        self.conv = dglnn.HeteroGraphConv({
                rel[1]: CustomHeteroGraphConv(g=self.g, in_feats=self.in_feat, out_feats=self.out_feat)
                for rel in rel_names
            })

    ...

    def forward(self, g, inputs):
        """Forward computation
        Parameters
        ----------
        g : DGLHeteroGraph
            Input graph.
        inputs : dict[str, torch.Tensor]
            Node feature for each node type.
        Returns
        -------
        dict[str, torch.Tensor]
            New node features for each node type.
        """
        g = g.local_var()
        if self.use_weight:
            weight = self.basis() if self.use_basis else self.weight
            wdict = {self.rel_names[i] : {'weight' : w.squeeze(0)}
                     for i, w in enumerate(th.split(weight, 1, dim=0))}
        else:
            wdict = {}

        if g.is_block:
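            # in a block, the output (dst) nodes are guaranteed to come first
            # among the input (src) nodes, so the first rows of the input
            # features are exactly the features of the output nodes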
            inputs_src = inputs
            inputs_dst = {k: v[:g.number_of_dst_nodes(k)] for k, v in inputs.items()}
        else:
            inputs_src = inputs_dst = inputs

        hs = self.conv(g, inputs, mod_kwargs=wdict)

        def _apply(ntype, h):
            if self.self_loop:
                h = h + th.matmul(inputs_dst[ntype], self.loop_weight)
            if self.bias:
                h = h + self.h_bias
            if self.activation:
                h = self.activation(h)
            return self.dropout(h)
        return {ntype : _apply(ntype, h) for ntype, h in hs.items()}

And here we come to the point where CustomHeteroGraphConv is used in self.conv.

So do I understand correctly that when

edge_predictions = model(edge_subgraph, blocks, input_features)

is called, the forward() method of CustomHeteroGraphConv is called, which takes as input g (the batched graph) and h? What is h in this case?

The error message suggests that an edge ID passed in creating the edge subgraph is greater than the number of edges in the graph.
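
You can reproduce the same failure on a toy graph (a made-up example):

import dgl
import torch

g = dgl.graph((torch.tensor([0, 1, 2]), torch.tensor([1, 2, 3])))  # 3 edges, IDs 0..2
sg = g.edge_subgraph(torch.tensor([0, 2]))    # fine
# g.edge_subgraph(torch.tensor([5]))          # DGLError: index out of range, as above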

The edge IDs of the training graph, passed via train_eid_dict, have the following form:

{('drug', 'drug-disease', 'disease'): tensor([ 0, ..., 97]), ('drug', 'drug-protein', 'protein'): tensor([  0, ..., 462]), ('protein', 'functional interaction', 'protein'): tensor([     0,      1,      2,  ..., 445296, 445297, 445298]), ('protein', 'genetic association', 'protein'): tensor([    0,     1,     2,  ..., 44316, 44317, 44318]), ('protein', 'physical interaction', 'protein'): tensor([     0,      1,      2,  ..., 176142, 176143, 176144]), ('protein', 'protein-disease', 'disease'): tensor([   0,    1,    2,  ..., 2085, 2086, 2087]), ('protein', 'signalling interaction', 'protein'): tensor([     0,      1,      2,  ..., 349482, 349483, 349484])}

(Note that this is now a different training graph with a different number of edges than the one from the error message, since I just ran my script again.)
To me it seems that the edge ID is what has this size (377591 vs. 5571). So which array (idx_data) is referred to here?

Sorry for the confusion. I think there is some inconsistency across examples and user guides and we should definitely fix that.

  1. I think there is a typo in the user guide section and edge_predictions = model(edge_subgraph, blocks, input_features) should be edge_predictions = model(positive_graph, negative_graph, blocks, input_features)
  2. By hs = self.embed_layer(batched_graph), are you learning node embeddings from scratch, i.e. no initial node features? If this is the case, you can ignore input_features in the following points.
  3. For LinkPredict, it should take positive_graph, negative_graph, blocks and input_features as in the Model class in user guide 6.3.
  4. For BaseRGCN, it should take blocks and input_features as the StochasticTwoLayerGCN class in 6.3.

My guess is that idx_data is an array of edge IDs and arr_len is probably the real number of edges in the graph, which might be inferred from the edge end nodes of the graph.
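
Concretely, points 1 and 3 would make the training loop look roughly like this (a sketch following user guide 6.3; compute_loss and opt are your own loss function and optimizer):

for input_nodes, positive_graph, negative_graph, blocks in train_loader:
    input_features = blocks[0].srcdata['features']
    pos_score, neg_score = model(positive_graph, negative_graph, blocks, input_features)
    loss = compute_loss(pos_score, neg_score)
    opt.zero_grad()
    loss.backward()
    opt.step()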

Thank you @mufeili for the response!

  1. I have real-valued numeric features for every node, and they have a different length for every node type. This is why I first apply an embedding layer with a per-node-type linear transformation so that they all have the same size (a sketch of such a projection is at the end of this post).

  2. So the positive_graph, negative_graph, blocks and input_features should be the input parameters for the forward() function of the CustomHeteroGraphConv.
    Does the forward function then need to consist of two parts when we do link prediction, as in tutorial 6.3:

  • the convolution (self.gcn) and
  • the score prediction (self.predictor)?
def forward(self, positive_graph, negative_graph, blocks, x):
    x = self.gcn(blocks, x) 
    pos_score = self.predictor(positive_graph, x) 
    neg_score = self.predictor(negative_graph, x) 
    return pos_score, neg_score

How can this work with the implementation of the RelGraphConvLayer defined above, where we call the forward() of the CustomHeteroGraphConv in 3 layers (input to hidden, hidden to hidden, and hidden to output)? Would we need this forward with the score prediction only for the last layer (once the latent feature embedding has been calculated)?
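
For reference, the per-node-type projection I mentioned in point 1 looks roughly like this (a minimal sketch; the class name and the sizes dict are hypothetical):

import torch.nn as nn

class NodeFeatureProjection(nn.Module):
    """Project per-node-type features of different widths to a common size."""
    def __init__(self, in_sizes, h_dim):
        super().__init__()
        self.proj = nn.ModuleDict({
            ntype: nn.Linear(in_size, h_dim)
            for ntype, in_size in in_sizes.items()
        })

    def forward(self, feats):
        # feats: dict mapping node type -> (num_nodes, in_sizes[ntype]) tensor
        return {ntype: self.proj[ntype](x) for ntype, x in feats.items()}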

  1. For CustomHeteroGraphConv, its forward function should only take blocks and x, i.e. features for the src nodes in the first block.
  2. Yes, you need to have input to both the graph neural network and the final predictor.
  3. You have a model composed of two parts, one being a graph neural network, one being a predictor. The graph neural network is responsible for updating node representations for the output nodes in the final block. The predictor is responsible for scoring on node pairs based on the node representations computed from the GNN. In your case, the graph neural network will be BaseRGCN, which is based on stacking multiple GNN layers (i.e. RelGraphConvLayer instances).

I have now implemented the forward() function of the LinkPredict model to incorporate the convolution (BaseRGCN with CustomHeteroGraphConv) and the predictor:

    def forward(self, positive_graph, negative_graph, blocks, h):
        """Custom forwardm method with the BaseRGCN as the encoder and the score predictor as the decoder."""
        # graph neural network, updating node representations for the output nodes in the final block
        # dictionary with key = node type, value = tensor (blocks.dstnodes[ntype].data['h_dst']) from CustomHeteroGraphConv
        outputs = self.rgcn.forward(blocks, h)
        # predictor
        pos_score = self.predictor(positive_graph, outputs)
        neg_score = self.predictor(negative_graph, outputs)
        return pos_score, neg_score
  1. I have defined self.predictor as an instance of the ScorePredictor, but I’ve got the feeling that this is not going to work for heterographs (with multiple node types). Do I have to iterate through the node types here?
class ScorePredictor(nn.Module):
    def forward(self, edge_subgraph, x):
        with edge_subgraph.local_scope():
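            # note: on a heterograph with multiple node types, assigning a dict
            # keyed by node type sets the feature for every node type at once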
            edge_subgraph.ndata['x'] = x
            for etype in edge_subgraph.canonical_etypes:
                edge_subgraph.apply_edges(
                    dgl.function.u_dot_v('x', 'x', 'score'), etype=etype)
            return edge_subgraph.edata['score']
  2. And what does the x in edge_subgraph.ndata['x'] refer to? Does it have to be changed to h_dst, which is used in the CustomHeteroGraphConv?
    def forward(self, blocks, h):
        ...
            return {ntype: blocks.dstnodes[ntype].data['h_dst']
                    for ntype in blocks.ntypes}

  3. Should the forward() function of the BaseRGCN then additionally take h as an input instead of constructing it inside the function? (I commented out the lines that were used before):
def forward(self, blocks, h):
    # def forward(self, g):
    """Forward function of BaseRGCN."""
    # note: blocks is a list with one block per layer, so there is no
    # blocks.local_var(); each layer receives its own block
    # g = g()
    # hs = {}
    # hs = self.embed_layer(g)
    for idx, layer in enumerate(self.layers):
        # hs = layer.forward(g, inputs=hs, layer_number=idx)
        h = layer.forward(blocks[idx], inputs=h, layer_number=idx)
    return h