Since I am constructing a covariance matrix between all graphs in my dataset, I have to re-combine the nodes of every graph-graph pair, keep the node data, and construct new in-between edges. For that I am merging two (or more) graphs with this function:
```python
import itertools

import dgl
import torch


def merge_graphs(graphs, keep_edges=False, create_new_inbetween_edges=True):
    """Merge two or more graphs into one.

    Arguments
    ---------
    graphs : list of dgl.DGLGraph objects
        The graphs which will be merged into one graph.

    Returns
    -------
    dgl.DGLGraph
        A merged DGLGraph with the same node data as the original graphs.

    Author
    ------
    Maximillian F. Vording

    Inspiration
    -----------
    njchoma
    url: https://discuss.dgl.ai/t/best-way-to-send-batched-graphs-to-gpu/171/6
    """
    g_merged = dgl.DGLGraph(graph_data=dgl.batch(graphs))

    # nodes: copy node data from each graph into the merged graph
    labels = graphs[0].node_attr_schemes()
    for l in labels.keys():
        g_merged.ndata[l] = torch.cat([g.ndata[l] for g in graphs], 0)

    # edges
    if keep_edges:
        labels = graphs[0].edge_attr_schemes()
        for l in labels.keys():
            g_merged.edata[l] = torch.cat([g.edata[l] for g in graphs], 0)
    else:
        g_merged.remove_edges(list(range(g_merged.number_of_edges())))

    if create_new_inbetween_edges:
        num_nodes = [g.number_of_nodes() for g in graphs]
        # global node indices that each graph's nodes get in the merged graph
        new_node_inds = [
            list(range(sum(num_nodes[:i]), sum(num_nodes[:i]) + num_nodes[i]))
            for i in range(len(num_nodes))
        ]
        for a, b in itertools.combinations(range(len(graphs)), 2):
            # connect every node of graph a to every node of graph b
            g_merged.add_edges(
                *zip(*itertools.product(new_node_inds[a], new_node_inds[b]))
            )
    return g_merged
```
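To sanity-check the index bookkeeping in isolation (no DGL needed), here is just the in-between-edge index computation, with made-up node counts for three hypothetical graphs:

```python
import itertools

# Hypothetical node counts for three graphs being merged (made-up numbers).
num_nodes = [2, 3, 2]

# Global indices that each graph's nodes get inside the merged graph.
new_node_inds = [
    list(range(sum(num_nodes[:i]), sum(num_nodes[:i]) + num_nodes[i]))
    for i in range(len(num_nodes))
]
print(new_node_inds)  # [[0, 1], [2, 3, 4], [5, 6]]

# For every pair of graphs, connect each node in one to each node in the other.
cross_edges = {}
for a, b in itertools.combinations(range(len(num_nodes)), 2):
    src, dst = zip(*itertools.product(new_node_inds[a], new_node_inds[b]))
    cross_edges[(a, b)] = (src, dst)

print(cross_edges[(0, 1)])  # ((0, 0, 0, 1, 1, 1), (2, 3, 4, 2, 3, 4))
```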
I run into problems with dgl.batch() not preserving references to the original nodes and their tensors, so I have to reconstruct the merged graphs every time the tensors on the original graphs are updated in each epoch. I also want to make sure that updates are consistent and shared between my BatchedDGLGraph and the original graphs in my dataset object, without having to set them explicitly as you suggest under BatchedDGLGraph/Update attributes, since that removes the common tensor reference that the merged graphs have.
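As I understand it, the root of the problem is that concatenation allocates fresh storage, so the merged graph's ndata can never be a view of the originals. A minimal pure-Python analogy (torch.cat copies in the same way):

```python
# Concatenation copies: the merged buffer does not alias the originals.
a = [1.0, 2.0]
b = [3.0, 4.0]
merged = a + b   # analogous to torch.cat([ta, tb], 0)

a[0] = 99.0      # update an "original graph" tensor
print(merged[0]) # still 1.0 -- the merged copy did not see the update
```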
How can I make sure that the node data refers back to the tensors in the original graphs, without having to set it explicitly after each update?
I considered using dgl.DGLSubGraph instead, but since it does not support sharing of node/edge features for now, I'm not sure how to make that work either. When will sharing be supported?
I hope my question makes sense; if not, I can elaborate with more code and explanations.
Thanks in advance