GATv2 implementation

Hi all,

I was wondering whether anyone has already implemented, or looked into, GATv2? It was proposed in this paper: [2105.14491] How Attentive are Graph Attention Networks?
This is basically the only difference from the normal GATConv:

I am having some difficulty implementing it in a convenient way, as the DGL version of GATConv decomposes the attention weight vector into a_l and a_r.
Any suggestions? :slight_smile:

Kind regards,

Hi Erik,

GATv2 can be implemented in much the same way as GAT.

  1. Decompose W \cdot [h_i || h_j] into W_l \cdot h_i + W_r \cdot h_j, where W = [W_l, W_r] is split column-wise. This is very similar to the a_l/a_r trick in the original GAT.
  2. The result of step 1 is an edge feature. You can then directly apply LeakyReLU and a^\intercal \cdot (\cdot) to it. The latter is just another linear transformation with output dimension one (a scalar).
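
To make the two steps concrete, here is a minimal framework-agnostic sketch in NumPy rather than actual DGL code. The function name `gatv2_edge_scores`, the index arrays `idx_i`/`idx_j` (the two endpoints of each edge), and the default slope of 0.2 are assumptions made for illustration. It computes only the unnormalized scores; the per-node softmax and the aggregation are left out. In DGL, the per-edge sum in step 1 would presumably map to a built-in message function such as `fn.u_add_v`.

```python
import numpy as np


def leaky_relu(x, negative_slope=0.2):
    # Elementwise LeakyReLU; the slope value is an assumed default.
    return np.where(x > 0, x, negative_slope * x)


def gatv2_edge_scores(h, idx_i, idx_j, W_l, W_r, a, negative_slope=0.2):
    """Unnormalized GATv2 score for each edge (idx_i[k], idx_j[k]).

    h: (N, d_in) node features; W_l, W_r: (d_out, d_in); a: (d_out,).
    """
    # Step 1: project every node once with W_l and W_r, then add per edge.
    # This equals W @ [h_i || h_j] with W = [W_l, W_r].
    z_l = h @ W_l.T                      # (N, d_out)
    z_r = h @ W_r.T                      # (N, d_out)
    e = z_l[idx_i] + z_r[idx_j]          # (E, d_out), an edge feature
    # Step 2: LeakyReLU first, then the linear map a^T (.) down to a scalar.
    return leaky_relu(e, negative_slope) @ a   # (E,)


# Small check that the decomposition matches the concatenated form.
rng = np.random.default_rng(0)
h = rng.normal(size=(4, 3))
W_l, W_r = rng.normal(size=(5, 3)), rng.normal(size=(5, 3))
a = rng.normal(size=5)
idx_i = np.array([0, 1, 2, 3, 0])
idx_j = np.array([1, 2, 3, 0, 2])

scores = gatv2_edge_scores(h, idx_i, idx_j, W_l, W_r, a)
W = np.concatenate([W_l, W_r], axis=1)                   # (d_out, 2 * d_in)
h_cat = np.concatenate([h[idx_i], h[idx_j]], axis=1)     # (E, 2 * d_in)
assert np.allclose(scores, leaky_relu(h_cat @ W.T) @ a)
```

The point of the decomposition is that the two projections are computed once per node instead of once per edge, so the concatenated vector never has to be materialized for every edge.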