Hi, I have a
DGLHeteroGraph with 3 different types of edges. Each edge has a feature 'attn_before_softmax'. For each node, I want to do a softmax over all of its incoming edges (the 3 types together). I tried the function
group_apply_edges, but it seems it can only be applied to one type of edge at a time (i.e., it cannot do softmax over the 3 edge types together). Is there an alternative way to do this? Any suggestions or tips would be appreciated.
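To pin down the desired semantics: a per-destination softmax over the pooled incoming edges of every type. Here is a minimal, loop-based sketch in plain Python (no DGL or PyTorch; the edge scores and destination ids are made-up toy data):

```python
import math
from collections import defaultdict

# toy example: 5 edges (pooled across all edge types) entering 3 nodes
attn = [1.0, 2.0, 3.0, 0.5, 0.5]   # 'attn_before_softmax' per edge
dst = [0, 0, 1, 2, 2]              # destination node of each edge

# group edge ids by destination node
groups = defaultdict(list)
for eid, v in enumerate(dst):
    groups[v].append(eid)

# softmax within each destination group
out = [0.0] * len(attn)
for eids in groups.values():
    m = max(attn[e] for e in eids)             # subtract max for numerical stability
    exps = {e: math.exp(attn[e] - m) for e in eids}
    z = sum(exps.values())
    for e in eids:
        out[e] = exps[e] / z
```

The scores for the edges entering each node sum to 1, regardless of which edge type each edge came from.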
Hi, you can first use our to_homo API to convert the heterograph to a homo graph, then apply the edge_softmax API to compute the softmax values (normalized by destination node), and finally convert the homo graph back to a heterograph with our to_hetero API.
Does to_homo store nodes and edges in the same order as ntypes and etypes? The documentation does not say anything about the order. I ask this because
to_homo does not copy node/edge features automatically, so we need to know the order of nodes and edges to copy the features correctly.
And how can we copy the features from the homo graph back to the heterograph? The documentation of
to_hetero says "The returned node and edge types may not necessarily be in the same order as
etypes". Suppose we want to copy the node features; does that mean that, for each node type, we need to use
filter_nodes to get the node IDs, and then use those IDs to retrieve the node features?
Or maybe we don’t need to do any conversion at all. We could use a function like torch_geometric.utils.softmax, although we would still need to gather the values of
attn_before_softmax from all edges into a single tensor and then distribute the results back.
Yes, in the current implementation. The type information is stored in the
dgl.NTYPE/dgl.ETYPE data fields. The following demo code shows how to copy the features across:
```python
hg = ...  # some heterograph
g = dgl.to_homo(hg)
g.ndata['h'] = th.zeros((g.number_of_nodes(), feat_size))
for ntid, nty in enumerate(hg.ntypes):
    # ids (in g) of the nodes whose type is nty
    nid = (g.ndata[dgl.NTYPE] == ntid).nonzero().view(-1)
    # their node ids in the original heterograph
    orig_nid = g.ndata[dgl.NID][nid]
    # copy features
    g.ndata['h'][nid] = hg.nodes[nty].data['h'][orig_nid]
```
Note that the code does not rely on whether
g stores features in the same order as ntypes and etypes.
You can leverage the node/edge id mapping generated and stored in the ndata/edata.
```python
g = dgl.graph(([0, 1, 2], [3, 4, 5]))  # a bipartite graph stored as a homograph
g.ndata[dgl.NTYPE] = th.tensor([0, 0, 0, 1, 1, 1])
g.edata[dgl.ETYPE] = th.tensor([0, 0, 0])
g.ndata['feat'] = ...
hg = dgl.to_hetero(g, ['user', 'item'], ['buy'])
# copy node features
hg.nodes['user'].data['feat'] = g.ndata['feat'][hg.nodes['user'].data[dgl.NID]]
hg.nodes['item'].data['feat'] = g.ndata['feat'][hg.nodes['item'].data[dgl.NID]]
```
Thank you for your reply with code examples! Now I clearly understand how to use
to_hetero!
But before your reply, I found a solution using the PyTorch Scatter library. It does not require converting the graph and solves the problem perfectly. The idea is to use torch_scatter.composite.scatter_softmax to normalize the attention scores. The
index argument of
scatter_softmax can be obtained using dgl.DGLHeteroGraph.all_edges.
```python
hg = ...  # the heterograph
# assume all hg.etypes share the same dsttype;
# otherwise, do this separately for each dsttype.
src = []
index = []
for etype in hg.etypes:
    attn = hg[etype].edata['attn_before_softmax']
    src.append(attn)
    uid, vid = hg.all_edges(form='uv', order='eid', etype=etype)
    index.append(vid)
src = th.cat(src, dim=0)
index = th.cat(index, dim=0).to(src.device)
a = scatter_softmax(src, index, dim=0)
```
Thanks for the code reference. I’m thinking about further improving DGL’s usability based on your example. One thing we could do is add the
scatter_softmax operator to DGL, if you find installing
torch_scatter a bit of a headache. Another direction is to automatically copy features when a heterograph is converted to a homo graph via
to_homo, something like the following:
```python
hg = ...  # the heterograph
# assume all hg.etypes share the same dsttype
g = dgl.to_homo(hg)
# to_homo would automatically copy and concat the
# 'attn_before_softmax' features of all edges
attn_before = g.edata['attn_before_softmax']
a = dgl.edge_softmax(g, attn_before)
```
Would you like this new behavior?
My use case is to implement a model similar to Heterogeneous Graph Transformer. Currently, I use
scatter_softmax to normalize the attention scores and
scatter_add to aggregate the values. So I definitely support adding
scatter_softmax to DGL, because then I wouldn’t have to install an additional library.
But I probably won’t use the
to_homo approach even if it automatically copies features, mainly out of concern about efficiency. The only steps specific to the
scatter_softmax approach are concatenating the attention scores and the
vids, and I don’t think they are more costly than converting the graph. Besides, storing features in the homo graph may consume more memory. For example,
hg has two kinds of nodes, namely user nodes and item nodes, and only the item nodes have a feature 'x'. If
to_homo automatically copies all features, the user nodes in the homo graph will also carry the feature
'x', consuming more memory.
I think the best solution is to introduce functions like
update_all that support a heterograph with more than one edge type, so that we could do something like:

```python
dgl.hetero_edge_softmax(hg, 'e', 'a')
hg.hetero_update_all(fn.u_mul_e('v', 'a', 'm'), fn.sum('m', 'rst'))
```
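For reference, here is what that pair of calls would compute, spelled out in plain Python over toy data (the edge list, scores, and source values are made up; 'e' is the raw score, 'a' the normalized score, 'v' the source-node value, and 'rst' the aggregated per-node result):

```python
import math
from collections import defaultdict

# edges pooled across all edge types: (src, dst, raw score 'e')
edges = [(0, 2, 1.0), (1, 2, 2.0), (0, 3, 0.5)]
value = {0: 10.0, 1: 20.0}   # source-node feature 'v'

# hetero_edge_softmax: normalize scores per destination node
by_dst = defaultdict(list)
for i, (_, v, _) in enumerate(edges):
    by_dst[v].append(i)
a = [0.0] * len(edges)       # normalized score 'a'
for eids in by_dst.values():
    m = max(edges[e][2] for e in eids)   # subtract max for stability
    exps = {e: math.exp(edges[e][2] - m) for e in eids}
    z = sum(exps.values())
    for e in eids:
        a[e] = exps[e] / z

# hetero_update_all(fn.u_mul_e('v', 'a', 'm'), fn.sum('m', 'rst')):
# message = source value * normalized score, summed per destination
rst = defaultdict(float)
for (u, v, _), w in zip(edges, a):
    rst[v] += value[u] * w
```

The point is that both stages operate on the edges of all types at once, grouped only by destination node.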