I have built a graph classification model similar to the one in this tutorial:
https://docs.dgl.ai/en/0.4.x/tutorials/basics/4_batch.html
on my own graph dataset. The problem is: when I remove all the GraphConv layers from the example above and keep only an nn.Linear layer, I get good accuracy even after 100 epochs. But as soon as I add even a single GraphConv layer, even one whose input and output sizes are the same (so you would think it could at least learn the identity function…), I can't overfit the training samples anymore. The loss on the TRAINING samples stays much higher even after 5000 epochs than it does with only the linear layer (I ran 5000 epochs to make sure the problem isn't that it just hasn't found good parameters yet).
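To make the comparison concrete, here is a rough pure-PyTorch sketch of the two setups I mean (this is not my actual code, which uses dgl.nn.GraphConv and batched DGL graphs; the class names and the dense-adjacency aggregation are just my simplification):

```python
import torch
import torch.nn as nn

class LinearOnly(nn.Module):
    # Setup 1: mean-pool the node features, then a single linear classifier.
    # This is the version that trains fine for me.
    def __init__(self, in_feats, n_classes):
        super().__init__()
        self.fc = nn.Linear(in_feats, n_classes)

    def forward(self, adj, h):
        # adj is unused here; only node features matter
        return self.fc(h.mean(dim=0))

class WithConv(nn.Module):
    # Setup 2: one graph-conv-style layer first, with in_feats == out_feats,
    # so in principle it could learn something close to the identity.
    # This is the version whose training loss stays high for me.
    def __init__(self, in_feats, n_classes):
        super().__init__()
        self.conv = nn.Linear(in_feats, in_feats)
        self.fc = nn.Linear(in_feats, n_classes)

    def forward(self, adj, h):
        # crude stand-in for GraphConv: average neighbour features
        # (adj @ h), then a linear transform and ReLU
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        h = torch.relu(self.conv(adj @ h / deg))
        return self.fc(h.mean(dim=0))
```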
So why is that? Also, where is the message-passing part of GCN in their graph classification example? Does GraphConv take care of that? If not, why didn't they include it in their example?
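Just so it's clear what I mean by "message passing": here is my understanding of what one round of it does, as a plain PyTorch sketch (the function name and edge-list representation are mine; I'm assuming this aggregate step is what GraphConv does internally before its linear transform):

```python
import torch

def message_passing_step(edges, h):
    """One round of message passing: each node receives the sum of its
    in-neighbours' feature vectors."""
    out = torch.zeros_like(h)
    for src, dst in edges:
        out[dst] += h[src]  # node src sends its features to node dst
    return out

# tiny example: 3 nodes with 1-d features, edges 0->1, 0->2, 1->2
h = torch.tensor([[1.0], [2.0], [3.0]])
edges = [(0, 1), (0, 2), (1, 2)]
out = message_passing_step(edges, h)  # node 2 receives h[0] + h[1] = 3
```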