Hi, do you have any suggestions or references on how to replace the softmax with a sigmoid in the Heterogeneous Graph Network example?

I imagine I can:

1. Apply `torch.sigmoid` to the attention scores;

2. Normalize the weights by their sum over each destination node's incoming edges;

3. Multiply the incoming messages by the weights and sum them at the destination node.

I am not sure how to do step 2. Any hints? Thanks!

```
[....]
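# Per-edge dot product of the destination 'q' and the source 'k', stored in edata['t']
sub_graph.apply_edges(fn.v_dot_u('q', 'k', 't'))
# Scale the scores by the relation prior and by sqrt(d_k)
attn_score = sub_graph.edata.pop('t').sum(-1) * relation_pri / self.sqrt_dk
# Softmax over the incoming edges of each destination node (the part to replace)
attn_score = edge_softmax(sub_graph, attn_score, norm_by='dst')
sub_graph.edata['t'] = attn_score.unsqueeze(-1)
# Weight the source values by 't', sum at the destination, then average over edge types
G.multi_update_all(
    {etype: (fn.u_mul_e('v_%d' % e_id, 't', 'm'), fn.sum('m', 't'))
     for etype, e_id in edge_dict.items()},
    cross_reducer='mean')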
```
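
For step 2, one possible approach is to sum the sigmoid weights over each destination node's incoming edges and divide every edge weight by that sum. The sketch below does this in plain PyTorch rather than through the DGL API. The function name `sigmoid_normalize_by_dst`, its arguments, the `eps` guard, and the toy tensors are all hypothetical and not part of the original example.

```python
import torch


def sigmoid_normalize_by_dst(attn_score, dst, num_dst_nodes, eps=1e-9):
    """Sigmoid each edge score, then normalize per destination node.

    attn_score:    (E, H) raw scores, one row per edge, one column per head.
    dst:           (E,) long tensor with the destination node index of each edge.
    num_dst_nodes: number of destination nodes.
    eps:           hypothetical guard against division by zero.
    """
    # Step 1: squash every score into (0, 1).
    weights = torch.sigmoid(attn_score)

    # Step 2a: sum the weights of all edges that share a destination node.
    denom = torch.zeros(num_dst_nodes, weights.shape[1],
                        dtype=weights.dtype, device=weights.device)
    denom.index_add_(0, dst, weights)

    # Step 2b: divide each edge weight by the sum at its own destination.
    return weights / (denom[dst] + eps)


# Toy data: 5 edges, 2 heads, 2 destination nodes.
dst = torch.tensor([0, 0, 1, 1, 1])
scores = torch.randn(5, 2)
normalized = sigmoid_normalize_by_dst(scores, dst, num_dst_nodes=2)
```

In the snippet above, the destination index of each edge would presumably come from `sub_graph.edges()`, and the result would take the place of the `edge_softmax` output. After the division the weights again sum to 1 at each destination node, so the main difference from softmax is that `exp` is replaced by the sigmoid.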