How to replace softmax with sigmoid?

Hi, do you have any suggestions or references for replacing the softmax with a sigmoid in the Heterogeneous Graph Network example?

I imagine I can:

  1. Use torch.sigmoid on the attention weights;
  2. Normalize all weights by the sum of the weights;
  3. Multiply the weights by the incoming messages and sum them into the current node.

I am not sure how to do step 2. Any hints? Thanks!
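To make the three steps concrete, here is a minimal sketch in plain Python on a toy graph, independent of DGL. The edge list, scores, and feature values are made up for illustration, and the helper name `sigmoid_attention_aggregate` is hypothetical. It assumes a single scalar feature per node and that "normalize" means dividing by the sum of weights over the incoming edges of each destination node.

```python
import math
from collections import defaultdict


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_attention_aggregate(edges, scores, src_feat):
    """edges: list of (src, dst) pairs; scores: one raw attention score per edge;
    src_feat: dict mapping a source node to a float feature.
    Returns a dict mapping each destination node to its aggregated value."""
    # Step 1: squash each raw score with a sigmoid.
    weights = [sigmoid(s) for s in scores]

    # Step 2: sum the weights over the incoming edges of each destination node.
    weight_sum = defaultdict(float)
    for (_, dst), w in zip(edges, weights):
        weight_sum[dst] += w

    # Step 3: weighted sum of source features, using the normalized weights.
    out = defaultdict(float)
    for (src, dst), w in zip(edges, weights):
        out[dst] += (w / weight_sum[dst]) * src_feat[src]
    return dict(out)


# Toy example (assumed data): node 2 has two incoming edges, node 3 has one.
edges = [(0, 2), (1, 2), (0, 3)]
scores = [0.0, 0.0, 5.0]
src_feat = {0: 1.0, 1: 3.0}
result = sigmoid_attention_aggregate(edges, scores, src_feat)
```

Because the normalized weights of each destination node sum to 1, a node with a single incoming edge simply receives that neighbor's feature, whatever the raw score was.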

   [....]
   sub_graph.apply_edges(fn.v_dot_u('q', 'k', 't'))
   attn_score = sub_graph.edata.pop('t').sum(-1) * relation_pri / self.sqrt_dk
   attn_score = edge_softmax(sub_graph, attn_score, norm_by='dst')

   sub_graph.edata['t'] = attn_score.unsqueeze(-1)

G.multi_update_all({etype : (fn.u_mul_e('v_%d' % e_id, 't', 'm'), fn.sum('m', 't')) \
                                for etype, e_id in edge_dict.items()}, cross_reducer = 'mean')

You can use a sum reduction to add up the weights on each destination node. Something like g.update_all(fn.copy_e('sigmoid_value', 'e'), fn.sum('e', 'h'))

@VoVAllen, I tried to implement it, but the runtime tripled once I used custom functions. Could you help me make it more efficient, if you have the time?

I think the real obstacle is that I need to divide by the sum of attentions per etype before taking the mean over all etypes in the cross_reducer.

Original code:

G.multi_update_all({etype : (fn.u_mul_e('v_%d' % e_id, 'att', 'm'), fn.sum('m', 't')) \
                                for etype, e_id in edge_dict.items()}, cross_reducer = 'mean')

Changed code:

from functools import partial

def mfunc(self, edges, eid_key):
    # Send both the attention-weighted message and the attention weight itself.
    return {'m': edges.src[eid_key] * edges.data['att'], 'att': edges.data['att']}

def rfunc(self, nodes):
    m = nodes.mailbox['m']
    att = nodes.mailbox['att']
    t = m.sum(1) / (att.sum(1) + 1e-3)  # normalize by the sum of all attentions
    return {'t': t}

G.multi_update_all({etype: (partial(self.mfunc, eid_key='v_%d' % e_id), self.rfunc)
                    for etype, e_id in edge_dict.items()}, cross_reducer='mean')

Is there a better way of doing it inside of G.multi_update_all? Or should I use some sort of for loop over the edge types and use subgraph.update_all?

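t = m.sum(1) / (att.sum(1)+1e-3)

This can be done by summing the weights with built-in functions and then dividing on the edges:

def efunc(edges):
    # Divide each edge's attention by the summed attention of its destination node.
    return {'norm_e': edges.data['att'] / edges.dst['reduce_sum_norm']}

# Sum the sigmoid attention weights over the incoming edges of each destination node.
g.update_all(fn.copy_e('att', 'e'), fn.sum('e', 'reduce_sum_norm'))
g.apply_edges(efunc)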
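A small plain-Python check of why this works may help: dividing each weight by the per-destination sum first and then using a plain sum gives the same value as summing the weighted messages and dividing afterwards (leaving aside the 1e-3 term). The toy data, the relation names, and the helper names below are hypothetical and only illustrate the arithmetic for one destination node with two edge types.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


# Assumed toy data: one destination node, two relations (edge types),
# each listing (raw_score, source_value) pairs for its incoming edges.
incoming = {
    "rel_a": [(0.3, 1.0), (-1.2, 4.0)],
    "rel_b": [(2.0, -2.0), (0.0, 6.0), (1.0, 0.5)],
}


def reduce_then_divide(pairs):
    # Custom-reducer style: sum the weighted messages, then divide by the weight sum.
    att = [sigmoid(s) for s, _ in pairs]
    m = sum(a * v for a, (_, v) in zip(att, pairs))
    return m / sum(att)


def normalize_then_sum(pairs):
    # Pre-normalized style: divide each weight by the per-destination sum first,
    # after which a plain sum reducer is enough.
    att = [sigmoid(s) for s, _ in pairs]
    total = sum(att)
    return sum((a / total) * v for a, (_, v) in zip(att, pairs))


per_rel_custom = {rel: reduce_then_divide(p) for rel, p in incoming.items()}
per_rel_prenorm = {rel: normalize_then_sum(p) for rel, p in incoming.items()}

# Cross-relation mean, applied after the per-relation normalization.
mean_custom = sum(per_rel_custom.values()) / len(per_rel_custom)
mean_prenorm = sum(per_rel_prenorm.values()) / len(per_rel_prenorm)
```

If that equivalence holds in your setup, storing the normalized weight on the edges of each relation should let the original built-in u_mul_e and sum pair stay in multi_update_all, reading 'norm_e' in place of 'att', so no custom reduce function is needed.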
