Batch graphs along feature dimension

pdeibert · May 25, 2024, 6:34pm

Hey everyone!

I’m currently using dgl.batch to batch a large number of graphs together. I know the default behavior is to create a single (flat) disjoint graph. However, I was wondering if there is a way to batch multiple graphs with identical topology and feature shapes along the feature dimension, or possibly along a new feature dimension.
So instead of offsetting the node and edge ids, the same node or edge in different graphs would be accessed by the same id, but at a different index along the features. I know it is a very specific use case, but it would drastically reduce the number of edge indices needed to encode a batched graph, and thus its memory footprint.
I know that PyTorch Geometric offers a way to do this by overriding __cat_dim__(), but so far I couldn’t find an equivalent in DGL. It shouldn’t be too hard to work around this limitation, but I was wondering if such functionality already exists in DGL.
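
To make the memory argument above concrete, here is a rough sketch in plain Python. It only mimics what flat batching does conceptually (replicating the edge list with offset node ids); it is not DGL's internal implementation, and the helper flat_batch_edges and the toy 3-node cycle are hypothetical names and data chosen for illustration.

```python
def flat_batch_edges(src, dst, num_nodes, batch_size):
    """Replicate one edge list batch_size times, offsetting node ids per copy."""
    flat_src, flat_dst = [], []
    for b in range(batch_size):
        offset = b * num_nodes  # each copy gets its own block of node ids
        flat_src.extend(s + offset for s in src)
        flat_dst.extend(d + offset for d in dst)
    return flat_src, flat_dst


# Hypothetical toy topology: a directed 3-cycle shared by 4 graphs.
src, dst = [0, 1, 2], [1, 2, 0]
num_nodes, batch_size = 3, 4

flat_src, flat_dst = flat_batch_edges(src, dst, num_nodes, batch_size)
flat_index_entries = len(flat_src) + len(flat_dst)  # grows with batch_size
shared_index_entries = len(src) + len(dst)          # independent of batch_size
print(flat_index_entries, shared_index_entries)     # 24 6
```

With a shared topology the edge index is stored once, so its size stays constant while only the feature tensors grow with the number of graphs.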

BarclayII · May 30, 2024, 2:09am

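DGL graphs’ node and edge features support multiple dimensions, so you can just stack them along a new dimension. The first dimension has to stay the node (or edge) dimension, so stack along dim 1:

import torch
new_g = gs[0].clone()
# Resulting shape: (num_nodes, num_graphs, ...); dim 0 must remain the node dimension.
new_g.ndata['stacked_feature'] = torch.stack([g.ndata['feature'] for g in gs], 1)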

pdeibert · May 30, 2024, 4:15pm

That’s what I ended up doing as a workaround. I was just curious whether DGL provided any functionality to handle this for me, instead of having to do it all manually, but it seems that is not the case.
