I’ve been trying to understand the recommended approach for minibatch training with a DGLGraph. It seems the idea is to call dgl.batch(<graph_list>), which creates a completely new graph by combining each graph’s ndata and edata.
What is confusing is that, looking at the implementation of GraphConv, I’m expected to provide a graph and node features. If I do this with a batched graph, I would expect to pass the node features with an additional batch dimension, but that isn’t supported.
It seems that when working with batched graphs, all the features should be stored in the graph before batching, which is a completely different idiom from what’s recommended for non-batched graphs. The problem with doing this is that the only way I can see to get results back out of a batched graph is through the provided readout functions. In my case, I want a per-node value with dimensions (batch, node, features), and the only route I can find is to return the graph, call dgl.unbatch(graph), and then extract the node values. Not only is this cumbersome, but I’m not sure whether it causes any problems for the autograd engine (I’m using PyTorch).
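On the autograd worry specifically: my understanding is that splitting or slicing the batched node tensor shouldn’t break anything, since those are ordinary differentiable tensor ops. A quick stand-alone PyTorch check (no DGL; the tensors just stand in for batched node features):

```python
import torch

# Stand-in for the (B*N, F) node feature tensor of a batched graph
# (two graphs with 2 and 4 nodes, 4 features each).
h = torch.randn(6, 4, requires_grad=True)

# Stand-in for some message-passing output (e.g. a GraphConv layer).
out = h * 2.0

# Split back into per-graph chunks, like unbatching would.
per_graph = torch.split(out, [2, 4], dim=0)

# Gradients flow back through the split: only the first graph's
# rows of h receive gradient from this loss.
per_graph[0].sum().backward()
print(h.grad)
```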
TL;DR: I want to be able to provide a batched graph and batched features to a GraphConv and get batched features back for training.
Additional point: I can get features back from the BatchedGraph, but I’m not sure how to reshape the data from (B*N, F) to (B, N, F), since I don’t know for certain the order in which batching is done.
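For what it’s worth, my understanding is that dgl.batch concatenates nodes in the order the graphs appear in the input list (all of graph 0’s nodes first, then graph 1’s, and so on), so when every graph has the same number of nodes N, the reshape is just a view. A torch-only sketch of both cases (the per-graph node counts would come from the batched graph itself, e.g. via batch_num_nodes):

```python
import torch

B, N, F = 3, 5, 8  # batch size, nodes per graph, feature dim

# Stand-in for g.ndata['h'] after a forward pass on the batched graph.
h_flat = torch.arange(B * N * F, dtype=torch.float32).view(B * N, F)

# Equal-sized graphs: (B*N, F) -> (B, N, F) is a plain view,
# assuming batching stacks each graph's nodes contiguously.
h_batched = h_flat.view(B, N, F)

# Variable-sized graphs: split by per-graph node counts instead
# (e.g. the list returned by the batched graph's batch_num_nodes).
counts = [5, 5, 5]
per_graph = torch.split(h_flat, counts, dim=0)
```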