Edges to include in training of link prediction for heterogeneous graphs

jedbl · September 16, 2020, 3:09pm

Hi,
I am building a recommender system using GNN and DGL. My graph is heterogeneous : I have 3 types of nodes (‘user’, ‘item’, ‘sport’), and 6 types of relations (user - buys - item, item - boughtby - user, user - practices - sport, etc.).

For training, I am using the EdgeDataLoader, where I specify the edge IDs I want to train on. Beforehand, I split my edge IDs into train, validation, and test sets.
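
For illustration, a split like the one described could look roughly like the following. This is a plain-Python sketch, not DGL code; the function name `split_edge_ids` and the 80/10/10 fractions are assumptions made up for the example.

```python
import random

def split_edge_ids(num_edges, valid_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle edge IDs 0..num_edges-1 and split them into three disjoint lists."""
    eids = list(range(num_edges))
    random.Random(seed).shuffle(eids)  # seeded so the split is reproducible
    n_valid = int(num_edges * valid_frac)
    n_test = int(num_edges * test_frac)
    valid = eids[:n_valid]
    test = eids[n_valid:n_valid + n_test]
    train = eids[n_valid + n_test:]
    return train, valid, test

# Hypothetical example: 1000 edges of one edge type.
train_eids, valid_eids, test_eids = split_edge_ids(1000)
```

In a heterogeneous graph, edge IDs are numbered separately per edge type, so a split like this would be done once per edge type of interest.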

To update the weights of my model, I use a max margin loss function, which takes the score on the positive edges (only the interesting edges, i.e. user - buys - item), and on the negative edges.
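
As a reference point, a max margin (hinge) loss over positive and negative edge scores can be sketched in plain Python as below. The margin of 1.0, the function name, and the example scores are assumptions for illustration; the actual loss in my code operates on tensors.

```python
def max_margin_loss(pos_scores, neg_scores, margin=1.0):
    """Mean hinge loss over all (positive, negative) score pairs.

    pos_scores: one score per positive edge.
    neg_scores: neg_scores[i] holds the scores of the negatives drawn
                for positive edge i.
    """
    total, count = 0.0, 0
    for pos, negs in zip(pos_scores, neg_scores):
        for neg in negs:
            # Penalize a negative that scores within `margin` of its positive.
            total += max(0.0, margin - pos + neg)
            count += 1
    return total / count

# Hypothetical scores: 2 positive edges, 2 negatives each.
loss = max_margin_loss([2.0, 0.5], [[0.0, 1.5], [0.0, 1.0]])
```

The loss is zero only when every positive edge outscores each of its negatives by at least the margin.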

My question is: when specifying the training edges in EdgeDataLoader, should I only include ‘interesting edges’, i.e. user - buys - item, or should I include all the edges?

I am having trouble understanding what would be the difference between including all the edges vs only including user-buys-item edges.

My understanding is that the EdgeDataLoader will create blocks that include all the necessary edges and nodes. Thus, I would only need user - buys - item edges. If I were to include all the edges in the training edges, then the EdgeDataLoader would create blocks that include unnecessary edges and nodes. Is that correct?

Thanks in advance!

BarclayII · September 17, 2020, 4:08am

Let’s say you have 6 relation types and you only wish to train on one of them. The graph you give to the EdgeDataLoader must contain all relation types, so that all relations are considered when EdgeDataLoader constructs the neighborhood for each GNN layer. The list of edges you give would have only one relation type, so that only the edges of that particular type are iterated over and have their negative examples constructed.

g = dgl.heterograph({
    ('user', 'buys', 'item'): ...,
    ('user', 'practices', 'sport'): ...,
    ...
    })
train_eids = {('user', 'buys', 'item'): ...}
dataloader = EdgeDataLoader(g, train_eids, sampler, ...)
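
To make the distinction concrete, here is a toy plain-Python sketch (not DGL's implementation) of why the graph needs every relation even when the seed edges are of one type. The toy edge lists and the helper names `seed_nodes` and `in_neighbors` are made up for this example.

```python
# Toy heterogeneous graph: canonical edge type -> list of (src, dst) pairs.
graph = {
    ('user', 'buys', 'item'): [(0, 0), (1, 0), (1, 1)],
    ('item', 'boughtby', 'user'): [(0, 0), (0, 1), (1, 1)],
    ('user', 'practices', 'sport'): [(0, 0), (1, 1)],
    ('sport', 'practicedby', 'user'): [(0, 0), (1, 1)],
}

def seed_nodes(graph, etype, eids):
    """Endpoints of the seed edges, as a set of (node_type, node_id)."""
    src_type, _, dst_type = etype
    nodes = set()
    for eid in eids:
        u, v = graph[etype][eid]
        nodes.add((src_type, u))
        nodes.add((dst_type, v))
    return nodes

def in_neighbors(graph, nodes):
    """One hop of full-neighbor expansion over every relation in the graph."""
    frontier = set(nodes)
    used = set()
    for (src_type, rel, dst_type), pairs in graph.items():
        for u, v in pairs:
            if (dst_type, v) in nodes:
                frontier.add((src_type, u))
                used.add(rel)
    return frontier, used

# Seed on a single 'buys' edge (user 0 -> item 0) only.
seeds = seed_nodes(graph, ('user', 'buys', 'item'), [0])
frontier, relations_used = in_neighbors(graph, seeds)
```

Even though only a 'buys' edge is iterated over, the neighborhood of its endpoints pulls in nodes through the other relations (here, sport 0 via 'practicedby'), which is why those relations must be present in the graph.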

jedbl · September 17, 2020, 1:43pm

Thank you for the clear answer!

hockeybro12 · September 23, 2020, 5:16am

Hi,

I am having an issue with this. I have a graph with 2 node types ('source' and 'user') and 1 edge type. The structure is like this: ('source', 'has_follower', 'user').

I was able to build this graph successfully on top of DGLBuiltinDataset, and I can get various edge properties from it, etc. Each source/user node in this graph has a feature, and the two node types have features of different dimensions (all source nodes share one dimension, and all user nodes share another).

I am now trying to train this for link prediction. I created the EdgeDataLoader like so:

sampler = dgl.dataloading.MultiLayerFullNeighborSampler(2)
# g is the graph
train_eid_dict = {('source', 'has_follower', 'user'): g.edges(etype='has_follower')}
dataloader = dgl.dataloading.EdgeDataLoader(g, ('source', 'has_follower', 'user'), sampler, negative_sampler=dgl.dataloading.negative_sampler.Uniform(5), batch_size=args.batch_size, shuffle=True, drop_last=False, pin_memory=True, num_workers=args.num_workers)

....
for input_nodes, positive_graph, negative_graph, blocks in dataloader:

When running that last line to iterate over the dataloader, I get ValueError: only one element tensors can be converted to Python scalars (the full traceback is at the bottom). Note that so far I am not using any training/test/validation masks when creating my dataset, as I want to train on this graph for link prediction in an unsupervised way and just generate embeddings.

I’m not sure what the issue is here or how I can debug it. Have I created the EdgeDataLoader eid dictionary incorrectly?

Traceback (most recent call last):
  File "GNN_model.py", line 285, in <module>
    main(args)
  File "GNN_model.py", line 225, in main
    for input_nodes, positive_graph, negative_graph, blocks in dataloader:
  File "path/lib/python3.6/site-packages/dgl/dataloading/pytorch/__init__.py", line 161, in __next__
    input_nodes, pair_graph, neg_pair_graph, blocks = next(self.iter_)
  File "path/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "path/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 856, in _next_data
    return self._process_data(data)
  File "path/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File "path/lib/python3.6/site-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "path/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "path/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "path/lib/python3.6/site-packages/dgl/dataloading/pytorch/__init__.py", line 133, in collate
    input_nodes, pair_graph, neg_pair_graph, blocks = super().collate(items)
  File "path/lib/python3.6/site-packages/dgl/dataloading/dataloader.py", line 678, in collate
    return self._collate_with_negative_sampling(items)
  File "path/lib/python3.6/site-packages/dgl/dataloading/dataloader.py", line 600, in _collate_with_negative_sampling
    items = utils.prepare_tensor_dict(self.g_sampling, items, 'items')
  File "path/lib/python3.6/site-packages/dgl/utils/checks.py", line 70, in prepare_tensor_dict
    for key, val in data.items()}
  File "path/lib/python3.6/site-packages/dgl/utils/checks.py", line 70, in <dictcomp>
    for key, val in data.items()}
  File "path/lib/python3.6/site-packages/dgl/utils/checks.py", line 37, in prepare_tensor
    data = F.tensor(data)
  File "path/lib/python3.6/site-packages/dgl/backend/pytorch/tensor.py", line 40, in tensor
    return th.as_tensor(data, dtype=dtype)
ValueError: only one element tensors can be converted to Python scalars

BarclayII · September 23, 2020, 5:33am

train_eid_dict should be a dict of edge types and edge IDs, instead of edge types and incident node pairs as you have right now.

To iterate over all edges of one edge type, you can try this:

train_eid_dict = {
    ('source', 'has_follower', 'user'): torch.arange(g.num_edges('has_follower'))}
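
The difference between the two can be seen on a toy edge list. This is a plain-Python illustration with made-up data; in DGL, g.edges() returns the source and destination node IDs by default, and passing form='eid' should return edge IDs instead.

```python
# Toy edge list for ('source', 'has_follower', 'user'):
# edge i connects src[i] -> dst[i].
src = [0, 0, 1, 2]
dst = [3, 1, 1, 0]

# Analogous to the default output of g.edges(etype=...): a pair of arrays
# holding incident node IDs. This is NOT what the dataloader expects.
endpoint_pairs = (src, dst)

# Edge IDs are simply positions in the edge list: 0 .. num_edges - 1.
edge_ids = list(range(len(src)))

def endpoints_of(eid):
    """Look up the incident nodes of one edge ID."""
    return src[eid], dst[eid]
```

The dataloader iterates over entries like those in `edge_ids` and looks up the endpoints itself, so passing a (src, dst) tuple where a 1-D tensor of IDs is expected leads to the conversion error above.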

hockeybro12 · September 23, 2020, 2:22pm

@BarclayII

Thanks for the help, that seems to work now. I have another question about how to use this EdgeDataLoader to train.

I have been following the Link Prediction docs and have set up my model like so:

class TestModel(nn.Module):
    # here we have a model that first computes the representation and then predicts the scores for the edges
    def __init__(self, in_features, hidden_features, out_features, rel_names):
        super().__init__()
        self.sage = TestRGCN(in_features, hidden_features, out_features, rel_names)
        self.pred = HeteroScorePredictor()
    def forward(self, g, neg_g, blocks, etype):
        h = self.sage(g, blocks)
        return self.pred(g, h, etype), self.pred(neg_g, h, etype)

class TestRGCN(nn.Module):
    def __init__(self, in_feats, hid_feats, out_feats, rel_names, n_layers=2, dropout=0.25, activation=None):
        super(TestRGCN, self).__init__()
        self.conv1 = dglnn.HeteroGraphConv({
                rel : dglnn.GraphConv(in_feats, hid_feats, norm='right')
                for rel in rel_names
            })
        self.conv2 = dglnn.HeteroGraphConv({
                rel : dglnn.GraphConv(hid_feats, out_feats, norm='right')
                for rel in rel_names
            })
    def forward(self, blocks, x):
        x = self.conv1(blocks[0], x)
        x = self.conv2(blocks[1], x)
        return x

And then I am training it like so:

source_feats = g.nodes['source'].data['source_embedding']
user_feats = g.nodes['user'].data['user_embedding']
node_features = {'source': source_feats, 'user': user_feats}

# I'm not sure exactly what in_features value should be here since source_feats and user_feats have different dimensions
model = TestModel(in_features=800, hidden_features=512, out_features=256, rel_names=g.etypes)

...
for input_nodes, positive_graph, negative_graph, blocks in dataloader:
    blocks = [b.to(torch.device('cuda')) for b in blocks]
    positive_graph = positive_graph.to(torch.device('cuda'))
    negative_graph = negative_graph.to(torch.device('cuda'))
    input_features = blocks[0].srcdata['features']
    
    pos_score, neg_score = model(positive_graph, negative_graph, input_features, ('source', 'has_follower', 'user'))
    loss = compute_loss(pos_score, neg_score)

When running this, I am getting a key error at blocks in the forward operation of TestRGCN:

    x = self.conv1(blocks[0], x)
  File "path/lib/python3.6/site-packages/dgl/heterograph.py", line 1968, in __getitem__
    raise DGLError('Invalid key "{}". Must be one of the edge types.'.format(orig_key))
dgl._ffi.base.DGLError: Invalid key "0". Must be one of the edge types.

I’m not sure what the issue is here. When I print blocks, I can see there are two of them in there and they have some data. I’m also not sure what my in_features value should be, since I have two sets of input features with different dimensions.

[Block(num_src_nodes={'source': 188, 'user': 60},
      num_dst_nodes={'source': 188, 'user': 60},
      num_edges={('source', 'has_follower', 'user'): 271},
      metagraph=[('source', 'user', 'has_follower')]), Block(num_src_nodes={'source': 188, 'user': 60},
      num_dst_nodes={'source': 10, 'user': 60},
      num_edges={('source', 'has_follower', 'user'): 271},
      metagraph=[('source', 'user', 'has_follower')])]

hockeybro12 · September 23, 2020, 11:00pm

@BarclayII

I was able to figure out the above issue. However, I’m still not sure what exactly should be the inputs to my model. In the EdgeDataLoader, I can see that the features I am interested in are there in blocks. In the Link Prediction example (https://docs.dgl.ai/guide/training-link.html), they pass in a dictionary of node features and edge types. The batch link prediction example (https://docs.dgl.ai/guide/minibatch-link.html) seems to be off, as they don’t even pass in the positive/negative graph to the model.

I have features for both source and user. I can pass in the blocks and a node feature like so:

pos_score, neg_score = model(positive_graph, negative_graph, blocks, node_features)

However, what would node_features be? According to the non-batch link prediction example, it should be something like below. Are edge features not included?

node_features = {'source': blocks[0].srcdata['source'], 'user': blocks[0].srcdata['user']}

Further, what would the dimension of in_features be in my model since the features of both my nodes have different dimension?

Finally, how would the forward function look? Would it iterate over edge types as in the non-batch link prediction example? Maybe something like below? But then how does it handle the various edge types?

    def forward(self, g, neg_g, blocks, x):
        # run both part of the graph through it given the input feature
        h = self.sage(blocks, x)
        return self.pred(g, h), self.pred(neg_g, h)

BarclayII · September 28, 2020, 7:41am

The discussion is continued here: Trouble Training Link Prediction on Heterograph with EdgeDataLoader

sopkri · October 2, 2020, 3:03pm

@hockeybro12 @BarclayII
Hi there! I’ve got a similar issue: I am using a ScorePredictor for link prediction on heterographs. Could you explain how you defined HeteroScorePredictor() here? The ScorePredictor implementation from Tutorial 6.3 does not seem to work for heterographs with multiple node types:

class ScorePredictor(nn.Module):
    def forward(self, edge_subgraph, x):
        with edge_subgraph.local_scope():
            edge_subgraph.ndata['x'] = x
            for etype in edge_subgraph.canonical_etypes:
                edge_subgraph.apply_edges(
                    dgl.function.u_dot_v('x', 'x', 'score'), etype=etype)
            return edge_subgraph.edata['score']
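
For reference, what I expect the predictor to compute can be written out in plain Python as below: with several node types, the embeddings are a dict keyed by node type, and each edge's score is the dot product of its source and destination embeddings. The function names and toy data here are assumptions for illustration, not DGL API.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def score_edges(edges_by_type, h):
    """Score every edge as the dot product of its endpoint embeddings.

    edges_by_type: {(src_type, rel, dst_type): [(u, v), ...]}
    h: {node_type: [embedding of node 0, embedding of node 1, ...]}
    Returns {canonical edge type: [score per edge]}.
    """
    scores = {}
    for (src_type, rel, dst_type), pairs in edges_by_type.items():
        scores[(src_type, rel, dst_type)] = [
            dot(h[src_type][u], h[dst_type][v]) for u, v in pairs
        ]
    return scores

# Hypothetical 2-dimensional output embeddings for each node type.
h = {
    'source': [[1.0, 0.0], [0.5, 0.5]],
    'user': [[2.0, 1.0], [0.0, 4.0]],
}
edges = {('source', 'has_follower', 'user'): [(0, 0), (1, 1), (0, 1)]}
scores = score_edges(edges, h)
```

Note that a dot-product score like this only makes sense if the output embeddings of both node types have the same dimension, and that the result is naturally a dict keyed by edge type rather than a single tensor.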

hockeybro12 · October 2, 2020, 10:47pm

Hi @sopkri

The documentation is outdated there. I actually created a new thread here (Trouble Training Link Prediction on Heterograph with EdgeDataLoader); see if it helps you. If not, you can respond there and I can answer, as it is working for me now.

I think you have done it correctly, what issues are you facing? Maybe it’s related to something else?