Update all node embeddings for prediction after minibatch training

Hi. After training a GNN for link prediction with mini-batch neighborhood sampling, following 6.3 Training GNN for Link Prediction with Neighborhood Sampling — DGL 0.6.1 documentation,
with a graph like
Graph(num_nodes={'post': 11911, 'user': 9482}, num_edges={('post', 'liked', 'user'): 363984, ('user', 'like', 'post'): 363984}, metagraph=[('post', 'user', 'liked'), ('user', 'post', 'like')])

since this is a recommender system, I would like to predict new edges for this graph. The problem is that I don't know how to compute the updated node embeddings for the full graph, since my model takes mini-batches of it.

My idea is something like:

new_feats = model.rgcn(graph, feats)

but since the stochastic R-GCN takes blocks, it doesn't work that easily.

How could I solve it? Thanks!

This should be handled automatically by PyTorch's autograd system. Suppose you have node embeddings created by torch.nn.Embedding; after you've sampled a mini-batch graph, you can use input_nodes to slice the embeddings and pass them to your GNN model. Something like:

for mini_batch in dataloader:
    input_nodes, pos_graph, neg_graph, blocks = mini_batch
    emb = node_embedding[input_nodes]
    new_feats = model.rgcn(blocks, emb)
    ...

In this way, gradients will be computed for node_embedding and used to update the embeddings during optimizer.step().
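For a heterogeneous graph like yours, the wiring could look roughly like the sketch below. This is just illustrative: emb_dim, the learning rate, and compute_loss are placeholders, not something from your code, and you may need to move tensors to the same device as your blocks.

import torch

# One learnable embedding table per node type (sizes are assumptions).
emb_dim = 64
node_embedding = {
    'post': torch.nn.Embedding(g.num_nodes('post'), emb_dim),
    'user': torch.nn.Embedding(g.num_nodes('user'), emb_dim),
}

# Optimize the model parameters and the embedding tables together,
# so optimizer.step() updates both.
params = list(model.parameters())
for emb in node_embedding.values():
    params += list(emb.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)

for input_nodes, pos_graph, neg_graph, blocks in dataloader:
    # Slice only the embeddings this mini-batch needs; for a heterogeneous
    # graph, input_nodes is a dict of node type -> node IDs.
    feats = {ntype: node_embedding[ntype](nids) for ntype, nids in input_nodes.items()}
    new_feats = model.rgcn(blocks, feats)
    loss = compute_loss(new_feats, pos_graph, neg_graph)  # placeholder loss function
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()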


Thank you for your answer. Since every block contains just N (batch_size) feats, how can I map those new_feats back to their index/ID? Because in a given batch, if I print:

for input_nodes, positive_graph, negative_graph, blocks in dataloader:
    blocks = [b.to(device) for b in blocks]
    post_features = blocks[0].srcdata['feats']['post'].to(device)
    user_features = blocks[0].srcdata['feats']['user'].to(device)
    input_features = {'post':post_features, 'user': user_features}
    x = model.rgcn(blocks, input_features)
    posts = x['post'].tolist()
    post_ids = input_nodes['post'].tolist()
    print(len(posts), len(post_ids))

I get different lengths: 2920 11898. Shouldn't they have the same length? Don't they correspond to the same indexes? Otherwise, how can I know which index a given tensor corresponds to?

Thanks!!!

No. posts holds the output (destination) nodes of the MFG, while post_ids are the input nodes of the MFGs, i.e. all 0- to k-hop neighbours, where k depends on the number of layers in the model.

blocks.dstdata[dgl.NID]


Thanks, but I get AttributeError: 'list' object has no attribute 'dstdata'

Sorry.
blocks[-1].dstdata[dgl.NID]
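To make the mapping explicit, a small sketch reusing the variables from your earlier snippet:

x = model.rgcn(blocks, input_features)

# Each row of x['post'] corresponds to one output (destination) node of the
# last block; its original ID in the full graph is given by dgl.NID.
post_out_ids = blocks[-1].dstdata[dgl.NID]['post']
assert x['post'].shape[0] == post_out_ids.shape[0]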


Hi again. This way keeps giving me memory issues. Code:

def mini_batch_prediction(g: dgl.DGLGraph, model: torch.nn.Module, device: torch.device):

    # One entry per canonical edge type, covering all edges of that type.
    eid_dict = {canonical_etype: torch.arange(g.num_edges(canonical_etype[1]), dtype=torch.int64) for canonical_etype in g.canonical_etypes}

    neighbor_sampler = dgl.dataloading.MultiLayerFullNeighborSampler(2)
    negative_sampler = dgl.dataloading.negative_sampler.Uniform(1)

    dataloader = dgl.dataloading.EdgeDataLoader(g, eid_dict, neighbor_sampler,
                                                negative_sampler=negative_sampler,
                                                batch_size=32,
                                                shuffle=True,
                                                drop_last=False,
                                                num_workers=0)

    post_new_feats = []
    user_new_feats = []

    for input_nodes, positive_graph, negative_graph, blocks in dataloader:
        blocks = [b.to(device) for b in blocks]
        post_features = blocks[0].srcdata['feats']['post'].to(device)
        user_features = blocks[0].srcdata['feats']['user'].to(device)
        input_features = {'post': post_features, 'user': user_features}
        x = model.rgcn(blocks, input_features)
        # Map each output embedding back to its original node ID.
        user_feats = x['user']
        user_idxes = blocks[-1].dstdata[dgl.NID]['user']
        for i, user in enumerate(user_idxes):
            user_new_feats.append((user, user_feats[i]))

    return post_new_feats, user_new_feats

Error:

Device: cuda:0
/opt/conda/lib/python3.7/site-packages/dgl/base.py:45: DGLWarning: EdgeDataLoader directly taking a BlockSampler will be deprecated and it will not support feature prefetching. Please use dgl.dataloading.as_edge_prediction_sampler to wrap it.
  return warnings.warn(message, category=category, stacklevel=1)
Traceback (most recent call last):
  File "main.py", line 85, in <module>
    mini_batch_inference=True)
  File "main.py", line 66, in main
    preds = mini_batch_prediction(graph, model, device)
  File "/code/dgl_recommendator/inference.py", line 117, in mini_batch_prediction
    x = model.rgcn(blocks, input_features)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/code/dgl_recommendator/models.py", line 61, in forward
    x = self.conv1(blocks[0], x)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/dgl/nn/pytorch/hetero.py", line 178, in forward
    **mod_kwargs.get(etype, {}))
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/dgl/nn/pytorch/conv/graphconv.py", line 423, in forward
    graph.update_all(aggregate_fn, fn.sum(msg='m', out='h'))
  File "/opt/conda/lib/python3.7/site-packages/dgl/heterograph.py", line 4876, in update_all
    ndata = core.message_passing(g, message_func, reduce_func, apply_node_func)
  File "/opt/conda/lib/python3.7/site-packages/dgl/core.py", line 357, in message_passing
    ndata = invoke_gspmm(g, mfunc, rfunc)
  File "/opt/conda/lib/python3.7/site-packages/dgl/core.py", line 332, in invoke_gspmm
    z = op(graph, x)
  File "/opt/conda/lib/python3.7/site-packages/dgl/ops/spmm.py", line 189, in func
    return gspmm(g, 'copy_lhs', reduce_op, x, None)
  File "/opt/conda/lib/python3.7/site-packages/dgl/ops/spmm.py", line 77, in gspmm
    lhs_data, rhs_data)
  File "/opt/conda/lib/python3.7/site-packages/dgl/backend/pytorch/sparse.py", line 757, in gspmm
    return GSpMM.apply(gidx, op, reduce_op, lhs_data, rhs_data)
  File "/opt/conda/lib/python3.7/site-packages/torch/cuda/amp/autocast_mode.py", line 118, in decorate_fwd
    return fwd(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/dgl/backend/pytorch/sparse.py", line 126, in forward
    out, (argX, argY) = _gspmm(gidx, op, reduce_op, X, Y)
  File "/opt/conda/lib/python3.7/site-packages/dgl/sparse.py", line 233, in _gspmm
    arg_e_nd)
  File "dgl/_ffi/_cython/./function.pxi", line 287, in dgl._ffi._cy3.core.FunctionBase.__call__
  File "dgl/_ffi/_cython/./function.pxi", line 232, in dgl._ffi._cy3.core.FuncCall
  File "dgl/_ffi/_cython/./base.pxi", line 155, in dgl._ffi._cy3.core.CALL
dgl._ffi.base.DGLError: [10:14:22] /opt/dgl/src/runtime/cuda/cuda_device_api.cc:97: Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading: CUDA: out of memory
Stack trace:
  [bt] (0) /opt/conda/lib/python3.7/site-packages/dgl/libdgl.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x4f) [0x7f5e5a28149f]
  [bt] (1) /opt/conda/lib/python3.7/site-packages/dgl/libdgl.so(dgl::runtime::CUDADeviceAPI::AllocDataSpace(DLContext, unsigned long, unsigned long, DLDataType)+0x108) [0x7f5e5a759748]
  [bt] (2) /opt/conda/lib/python3.7/site-packages/dgl/libdgl.so(dgl::runtime::NDArray::Empty(std::vector<long, std::allocator<long> >, DLDataType, DLContext)+0x351) [0x7f5e5a5c7a71]
  [bt] (3) /opt/conda/lib/python3.7/site-packages/dgl/libdgl.so(dgl::aten::NewIdArray(long, DLContext, unsigned char)+0x6d) [0x7f5e5a252b4d]
  [bt] (4) /opt/conda/lib/python3.7/site-packages/dgl/libdgl.so(dgl::runtime::NDArray dgl::aten::impl::Range<(DLDeviceType)2, long>(long, long, DLContext)+0x9a) [0x7f5e5a77407a]
  [bt] (5) /opt/conda/lib/python3.7/site-packages/dgl/libdgl.so(dgl::aten::Range(long, long, unsigned char, DLContext)+0x1fd) [0x7f5e5a252edd]
  [bt] (6) /opt/conda/lib/python3.7/site-packages/dgl/libdgl.so(std::pair<dgl::runtime::NDArray, dgl::runtime::NDArray> dgl::aten::impl::Sort<(DLDeviceType)2, long>(dgl::runtime::NDArray, int)+0x50) [0x7f5e5a7823b0]
  [bt] (7) /opt/conda/lib/python3.7/site-packages/dgl/libdgl.so(dgl::aten::Sort(dgl::runtime::NDArray, int)+0x21a) [0x7f5e5a26575a]
  [bt] (8) /opt/conda/lib/python3.7/site-packages/dgl/libdgl.so(void dgl::aten::impl::COOSort_<(DLDeviceType)2, long>(dgl::aten::COOMatrix*, bool)+0x5b) [0x7f5e5a78caeb]

Isn't there a less memory-consuming way to do this?

You can switch to a different sampler, such as NeighborSampler with a small fan-out, or reduce the batch_size.
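For instance, a sketch following the deprecation warning in your log; the fan-out and batch size here are just illustrative values:

# Cap the number of sampled neighbours per layer instead of taking all of
# them, and use a smaller batch size; both reduce the peak memory per batch.
negative_sampler = dgl.dataloading.negative_sampler.Uniform(1)
sampler = dgl.dataloading.as_edge_prediction_sampler(
    dgl.dataloading.NeighborSampler([5, 5]),   # at most 5 neighbours per layer
    negative_sampler=negative_sampler)

dataloader = dgl.dataloading.EdgeDataLoader(
    g, eid_dict, sampler,
    batch_size=8, shuffle=True, drop_last=False, num_workers=0)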


I tried both, but I still get the same OOM issue.

Also, what is the point of excluding reverse edges? (My graph has just two edge types, which are the reverse of each other.)

Thank you!

  1. What have you set for the number of neighbours (the sampler fan-out)?
  2. Have you wrapped the call to this function in torch.no_grad()?
  3. Also check how much free GPU memory you had before running the prediction function (see the sketch below for 2 and 3).
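For points 2 and 3, something like this is what I mean (just a sketch):

# 2. Run the prediction without building the autograd graph.
model.eval()
with torch.no_grad():
    preds = mini_batch_prediction(g, model, device)

# 3. Check how much GPU memory is already in use before calling it
#    (or simply look at nvidia-smi).
print(torch.cuda.memory_allocated() / 1024 ** 3, 'GiB allocated')
print(torch.cuda.memory_reserved() / 1024 ** 3, 'GiB reserved')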

Where are we excluding them? Are we talking about the R-GCN in your code?
Edit: Oh, you mean edge_dir in the sampler? The outgoing edges are considered too; think of it as every edge being considered at least once over the entire graph.
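If you mean the exclude option of EdgeDataLoader, a sketch of how it is typically used for two mutually-reverse edge types, assuming 'like' and 'liked' are exact reverses of each other (which your graph suggests):

# Remove the reverse counterpart of each sampled edge from the sampled
# neighbourhood, so the model cannot see the very edge it is asked to predict.
dataloader = dgl.dataloading.EdgeDataLoader(
    g, train_eid, sampler,
    exclude='reverse_types',
    reverse_etypes={'like': 'liked', 'liked': 'like'},
    batch_size=8, shuffle=True, drop_last=False, num_workers=0)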

Quite small.

negative_sampler = dgl.dataloading.negative_sampler.Uniform(1)
sampler = dgl.dataloading.as_edge_prediction_sampler(
    dgl.dataloading.NeighborSampler([1, 1]),
    negative_sampler=negative_sampler)
dataloader = dgl.dataloading.EdgeDataLoader(
    g, train_eid, sampler,
    batch_size=8, shuffle=True, drop_last=False, num_workers=0)

Yes.

Not sure how to check that.

I was referring to the docs: dgl.dataloading.EdgeDataLoader — DGL 0.8.0post2 documentation

Thank you so much…
