In the source code, GATConv is implemented using fn.u_mul_e, where the edge field e is the attention weight. NNConv also uses fn.u_mul_e, where the edge field e is the edge-conditioned parameter W. I found that there is a fundamental conflict between them, since you can only pass one message function, and therefore multiply by only one edge field, per update_all call.
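For reference, here is a minimal runnable sketch of the two patterns as I understand them (toy graph, made-up field names, not the exact library code):

import torch
import dgl
import dgl.function as fn

# Toy directed graph with 3 nodes and 3 edges; all field names below are placeholders.
g = dgl.graph(([0, 1, 2], [1, 2, 0]))
g.ndata['ft'] = torch.randn(3, 4)

# GATConv-style: the edge field holds an (already normalized) attention weight per edge.
g.edata['a'] = torch.rand(3, 1)
g.update_all(fn.u_mul_e('ft', 'a', 'm'), fn.sum('m', 'h_att'))

# NNConv-style: the edge field holds an edge-conditioned weight matrix W per edge.
g.ndata['hu'] = g.ndata['ft'].unsqueeze(-1)   # (N, in, 1)
g.edata['w'] = torch.randn(3, 4, 5)           # (E, in, out)
g.update_all(fn.u_mul_e('hu', 'w', 'm'), fn.sum('m', 'neigh'))
g.ndata['h_w'] = g.ndata['neigh'].sum(dim=1)  # (N, out)

Both reduce to a single u_mul_e over a single edge field, which is exactly where the conflict shows up if you want both multiplications at once.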
What I prefer:
g.update_all([fn.u_mul_e(...), fn.u_mul_e(...)], fn.sum(...)), i.e. let update_all accept a list of message functions, or ...
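For comparison, what I have to do today is fuse the two edge fields into one before calling update_all. A minimal workaround sketch (field names are placeholders; the NNConv-style weight is kept element-wise here for simplicity), which is what the list-of-message-functions API above would express directly:

import torch
import dgl
import dgl.function as fn

g = dgl.graph(([0, 1, 2], [1, 2, 0]))
g.ndata['ft'] = torch.randn(3, 4)
g.edata['a'] = torch.rand(3, 1)   # GAT-style attention weight
g.edata['w'] = torch.rand(3, 4)   # NNConv-style edge weight

# Fuse the two edge fields, then a single u_mul_e computes u * a * w in one pass.
g.edata['aw'] = g.edata['a'] * g.edata['w']
g.update_all(fn.u_mul_e('ft', 'aw', 'm'), fn.sum('m', 'h'))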
What do you think?