Regularization on GAT multiple heads

Hi there,

I'm wondering how to add regularization on the weights of the multiple heads in GAT in DGL, as in the paper "Multi-Head Attention with Disagreement Regularization".



I assume you are referring to the regularization terms proposed in Section 3.2.

Disagreement on Subspaces, Disagreement on Attended Positions, and Disagreement on Outputs are applied to the old node representations, the edge attentions, and the new node representations, respectively.

I think all of these regularization terms can be computed directly with dense matrix multiplications. You can compute the node representations and edge attentions as in our DGL GAT example, then pass those tensors to the regularization computation exactly as you would without DGL.
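To make this concrete, here is a minimal PyTorch sketch of the dense computation. It is my reading of the terms in Section 3.2, not code from the paper or from DGL, so please check it against the paper. The function names are hypothetical. I assume per-head node representations of shape (N, H, D) and per-head edge attentions of shape (E, H); if your GAT layer returns attentions with a trailing dimension of size 1, squeeze it first.

```python
import torch
import torch.nn.functional as F


def cosine_disagreement(x):
    """Disagreement between heads for node representations.

    x: tensor of shape (N, H, D), one D-dim vector per node and head.
    Returns the negative mean pairwise cosine similarity between heads,
    averaged over nodes. Usable for both the "subspace" (old/projected
    representations) and the "output" (new representations) terms.
    """
    x = F.normalize(x, dim=-1)             # unit-length vector per head
    sim = torch.bmm(x, x.transpose(1, 2))  # (N, H, H) cosine similarities
    return -sim.mean()


def position_disagreement(attn):
    """Disagreement between heads for edge attentions.

    attn: tensor of shape (E, H), one attention weight per edge and head.
    Returns the negative overlap of attention between heads: the
    elementwise products summed over edges, averaged over head pairs.
    (Assumption: no extra normalization by graph size.)
    """
    num_heads = attn.shape[1]
    overlap = attn.t() @ attn              # (H, H), entry (i, j) = sum_e a_i * a_j
    return -overlap.sum() / (num_heads ** 2)
```

Since the paper maximizes disagreement, one option is to subtract the weighted terms from the task loss, e.g. `loss = task_loss - lam * (cosine_disagreement(h_new) + position_disagreement(attn))`, where `lam`, `h_new` and `attn` are placeholder names for the weight and for the tensors you get from your GAT layer.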

Thanks for sharing.