Hi,
I’m having some issues training a link prediction model on a heterograph using the edge data loader. Specifically, I have a graph with two types of nodes, source and user, with the relation that a user is a follower of a source. The source nodes have a feature called source_embedding with dimension 750, and the user nodes have a user_embedding feature with dimension 800. I’ve been able to construct my graph successfully in DGL.
I’m not sure what exactly the inputs to my RGCN model should be. With the EdgeDataLoader, I can see that the features I am interested in (user_embedding, source_embedding) are there in blocks. In the Link Prediction example (https://docs.dgl.ai/guide/training-link.html), they pass in a dictionary of node features and edge types. The batch link prediction example (https://docs.dgl.ai/guide/minibatch-link.html) seems to be off, as they don’t even pass in the positive/negative graph to the model, which leads to my confusion. I haven’t been able to find an example that does what the non-batch link prediction guide does, but in mini-batches.
I have features for both source and user. I can pass in the blocks and the node features like so:
pos_score, neg_score = model(positive_graph, negative_graph, blocks, node_features, ('source', 'has_follower', 'user'))
However, what would node_features be? According to the non-batch link prediction example, it should be something like below. Are edge features not included?
node_features = {'source': blocks[0].srcdata['source'], 'user': blocks[0].srcdata['user']}
Further, what would the dimension of in_features be in my model, since the features of my two node types have different dimensions?
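To make the dimension question concrete: one possible approach (an assumption on my part, not something taken from the DGL guides) would be to give each node type its own linear layer that maps its raw features to a shared size, so the RGCN layers only ever see one in_features value. A minimal sketch in plain PyTorch, where the class name TypeWiseProjection, the shared size 64, and the node counts are all hypothetical and the tensors are random stand-ins for source_embedding and user_embedding:

```python
import torch
import torch.nn as nn


class TypeWiseProjection(nn.Module):
    """Hypothetical helper: one linear layer per node type, mapping each
    type's raw features to a shared size."""

    def __init__(self, in_dims, out_dim):
        super().__init__()
        # in_dims maps node type -> raw feature size,
        # e.g. {'source': 750, 'user': 800}
        self.proj = nn.ModuleDict(
            {ntype: nn.Linear(dim, out_dim) for ntype, dim in in_dims.items()}
        )

    def forward(self, feats):
        # feats maps node type -> tensor of shape (num_nodes, in_dims[ntype])
        return {ntype: self.proj[ntype](x) for ntype, x in feats.items()}


torch.manual_seed(0)
projector = TypeWiseProjection({'source': 750, 'user': 800}, out_dim=64)

# Random stand-ins for the real source_embedding / user_embedding features
node_features = {
    'source': torch.randn(5, 750),
    'user': torch.randn(7, 800),
}
projected = projector(node_features)
projected_shapes = {ntype: tuple(h.shape) for ntype, h in projected.items()}
```

If something like this were used, the dictionary it returns would be what gets passed to the RGCN together with the blocks, and in_features would simply be the shared size (64 in this sketch).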
Finally, what would the forward function of the model look like? Would it iterate over edge types as in the non-batch link prediction example? Maybe something like below? But then how does it handle the various edge types? Is node_features constructed correctly?
class HeteroScorePredictor(nn.Module):
    def forward(self, edge_subgraph, x):
        with edge_subgraph.local_scope():
            edge_subgraph.ndata['x'] = x
            for etype in edge_subgraph.canonical_etypes:
                edge_subgraph.apply_edges(
                    dgl.function.u_dot_v('x', 'x', 'score'), etype=etype)
            return edge_subgraph.edata['score']
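For reference, my understanding of what the u_dot_v scoring amounts to for a single edge type is a dot product between the representation of each edge's source node and that of its destination node. The sketch below is a plain PyTorch stand-in for that idea, not DGL code: the function name dot_scores and the toy tensors are made up for illustration, and it assumes both node types already share the same hidden size (otherwise the dot product is undefined).

```python
import torch


def dot_scores(h, canonical_etype, src_idx, dst_idx):
    """Score the edges of one type as dot(h[src_type][u], h[dst_type][v])."""
    src_type, _, dst_type = canonical_etype
    h_u = h[src_type][src_idx]  # (num_edges, hidden)
    h_v = h[dst_type][dst_idx]  # (num_edges, hidden)
    # keepdim=True gives one score per edge with shape (num_edges, 1)
    return (h_u * h_v).sum(dim=-1, keepdim=True)


# Toy node representations keyed by node type (hidden size 2)
h = {
    'source': torch.tensor([[1.0, 0.0], [0.0, 2.0]]),
    'user': torch.tensor([[3.0, 1.0], [1.0, 1.0], [0.0, 4.0]]),
}
etype = ('source', 'has_follower', 'user')

# Three edges: source 0 -> user 0, source 1 -> user 1, source 1 -> user 2
src_idx = torch.tensor([0, 1, 1])
dst_idx = torch.tensor([0, 1, 2])

scores = dot_scores(h, etype, src_idx, dst_idx)
```

With more than one edge type, the same function would be called once per canonical edge type, each with its own edge index tensors, which is roughly what the loop over canonical_etypes does.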
class TestModel(nn.Module):
    def __init__(self, in_features, hidden_features, out_features, rel_names):
        super().__init__()
        self.sage = FakeNewsRGCN(in_features, hidden_features, out_features, rel_names)
        self.pred = HeteroScorePredictor()

    def forward(self, g, neg_g, blocks, x, etype):
        # compute node representations from the blocks and input features,
        # then score the edges of both the positive and the negative graph
        h = self.sage(blocks, x)
        return self.pred(g, h, etype), self.pred(neg_g, h, etype)
pos_score, neg_score = model(positive_graph, negative_graph, blocks, node_features, ('source', 'has_follower', 'user'))