Merging multiple graphs into a single graph

Hi all,

Is there an easy/efficient way to merge multiple graphs into a single graph? I considered using BatchedDGLGraph, but I would need to add some new edges between the nodes of the different sub-graphs afterwards and BatchedDGLGraph is read-only.

Thank you

Hi @acho,
I suggest using dgl.batch, but you mentioned that you need to add new edges. My question is: how would you like to initialize the edge features of the new edges?

I’m refactoring the code to merge BatchedDGLGraph with DGLGraph, and ideally you will be able to add new edges to batched graphs. I’m interested in how you would deal with the features of these new edges.

Hi. I assumed that they could be initialised the same way they are initialised in a regular DGLGraph. I think I am probably not understanding the problem.

I mean that when edges in your DGLGraph already have attributes, newly added edges need their attributes initialized too (by default we initialize them to all zeros). Ignore that if it’s not important in your case.
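As a rough illustration of that default, here is a sketch in plain Python lists rather than DGL's own feature storage. The helper name extend_with_zeros is hypothetical and not part of DGL; it only shows the idea of keeping existing rows and zero-filling the rows for new edges:

```python
def extend_with_zeros(features, num_new):
    """Return a copy of `features` with `num_new` all-zero rows appended.

    `features` is a list of equal-length rows, one row per existing edge.
    This is an illustrative stand-in, not DGL's actual implementation.
    """
    width = len(features[0]) if features else 0
    kept = [list(row) for row in features]            # existing attributes are copied
    added = [[0.0] * width for _ in range(num_new)]   # new edges start at zero
    return kept + added


edge_feat = [[1.0, 1.0], [2.0, 2.0]]       # two existing edges, feature size 2
extended = extend_with_zeros(edge_feat, 1)  # one newly added edge
print(extended)  # [[1.0, 1.0], [2.0, 2.0], [0.0, 0.0]]
```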

We will provide a flatten interface for batched graphs, which treats the batched graph as a single graph that you can then mutate.

Yes, copying already existing edge attributes and initialising new edge attributes as zero seems pretty reasonable.

Thank you.

@acho, we have merged DGLGraph and BatchedDGLGraph and provided the flatten API in the master branch. You can try this new feature by installing from source on the master branch or by installing the nightly build with pip.

Now you can add new nodes/edges on the batched graph:

>>> import dgl
>>> import torch
>>> g = dgl.DGLGraph()
>>> g.add_nodes(3)
>>> g.add_edges([0,1,2],[1,2,0])
>>> g.ndata['h'] = torch.ones(3, 5)
>>> g1 = dgl.DGLGraph()
>>> g1.add_nodes(4)
>>> g1.add_edges([0,1,2,3],[0,1,2,3])
>>> g1.ndata['h'] = torch.ones(4, 5) * 2
>>> large_g = dgl.batch([g, g1])
>>> large_g.ndata
{'h': tensor([[1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [2., 2., 2., 2., 2.],
        [2., 2., 2., 2., 2.],
        [2., 2., 2., 2., 2.],
        [2., 2., 2., 2., 2.]])}
>>> large_g.batch_size
2
>>> large_g.batch_num_nodes
[3, 4]
>>> large_g.batch_num_edges
[3, 4]

Note that you can add/remove nodes/edges on large_g directly, but you will receive a warning:

>>> large_g.add_nodes(5)
/Users/###/dgl/python/dgl/base.py:25: UserWarning: The graph has batch_size > 1, and mutation would break batching related properties, call `flatten` to remove batching information of the graph.
  warnings.warn(msg, warn_type)

To suppress the warning, you can call large_g.flatten() to make it a single graph:

>>> large_g.flatten()
>>> large_g.batch_size
1
>>> large_g.batch_num_nodes
[7]
>>> large_g.batch_num_edges
[7]
>>> large_g.add_nodes(5)
>>> large_g.ndata
{'h': tensor([[1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [2., 2., 2., 2., 2.],
        [2., 2., 2., 2., 2.],
        [2., 2., 2., 2., 2.],
        [2., 2., 2., 2., 2.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]])}
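
For the original question (adding edges between nodes of different sub-graphs), the main thing to get right is node numbering: in the batched graph the nodes are relabeled consecutively, so the nodes of g1 above become nodes 3 to 6 of large_g. Below is a minimal sketch of that offset arithmetic in plain Python. The helpers node_offsets and global_node_id are hypothetical names, not part of DGL, and they only assume the per-graph node counts that batch_num_nodes reports before flattening:

```python
from itertools import accumulate


def node_offsets(batch_num_nodes):
    """Return the batched-graph ID of the first node of each sub-graph."""
    # Exclusive prefix sum of the node counts, e.g. [3, 4] -> [0, 3].
    return [0] + list(accumulate(batch_num_nodes))[:-1]


def global_node_id(batch_num_nodes, graph_idx, local_id):
    """Map (sub-graph index, node ID inside that sub-graph) to a batched-graph node ID."""
    if not 0 <= local_id < batch_num_nodes[graph_idx]:
        raise IndexError("local_id is out of range for this sub-graph")
    return node_offsets(batch_num_nodes)[graph_idx] + local_id


# Node counts as reported by large_g.batch_num_nodes before flatten().
batch_num_nodes = [3, 4]

# Connect node 2 of the first graph to node 0 of the second graph.
src = global_node_id(batch_num_nodes, 0, 2)
dst = global_node_id(batch_num_nodes, 1, 0)
print(src, dst)  # 2 3
```

With these IDs, the cross-graph edge can then be added on the flattened graph with large_g.add_edges([src], [dst]), the same call used to build g and g1 above. Record batch_num_nodes before calling flatten, since afterwards it only reports the total.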

I hope this satisfies your needs.

@zihao, thank you very much for your work and for the detailed explanation! It definitely satisfies my needs and probably others.