Hi, I have a small question here.
Take dgl.ops.copy_u_max as an example.
import dgl
import torch as th
from dgl import ops as F

g = dgl.graph(([0, 0, 0, 1, 1, 2], [0, 1, 2, 1, 2, 2]))
x = th.ones(3, 2, requires_grad=True)
out = F.copy_u_max(g, x)  # per-destination max over source-node features
out = out.sum()
out.backward()
print(x.grad)
# tensor([[1., 1.],
# [1., 1.],
# [1., 1.]])
We can see that in a tie such as max(x1=3, x2=3) = 3, the gradient is routed to only one of the inputs, either x1 or x2. But from my perspective, both x1 and x2 contribute to the result and should both receive gradients. Why does DGL behave this way? Hoping for your answer. Thanks.
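For reference, plain PyTorch (not DGL) shows the same tie-breaking behavior: a reduction max over a dimension returns a single argmax index per slice, and backward routes the entire gradient to that one index, even when several inputs are tied. A minimal sketch:

```python
import torch

# Three tied inputs, all equal to the maximum.
x = torch.ones(3, requires_grad=True)

# max over a dim returns one argmax per slice; backward scatters
# the whole gradient to that single index, ignoring the other ties.
vals, idx = x.max(dim=0)
vals.backward()

print(x.grad)        # one-hot: only one of the three tied entries gets grad
print(x.grad.sum())  # total gradient is 1, not split among the ties
```

So the question is essentially whether a tied max should use this one-hot subgradient or distribute the gradient across all tied inputs.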