Reproducibility issue

When running a standard GCN model with DGL, I ran into a reproducibility issue: even though I set the seed everywhere I could, the results are still inconsistent between runs. The PyG implementation has the same problem; sometimes the accuracy differs at the thousandths place. I noticed that the scatter_add operator, which is commonly used to implement sparse matrix multiplication and message passing, seems to have a reproducibility issue; see https://discuss.pytorch.org/t/possible-solution-to-the-reproducibility-issues-using-scatter-add-operation/48989. Are there any other factors that could cause this, and is there a solution? For reference, my DGL version is 0.7.2, PyTorch is 1.10.0, Python is 3.8.12, and CUDA is 11.3, and I run the model on an RTX 3090. If I run the experiment on CPU, the results are fully reproducible.
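For reference, "setting the seed everywhere" usually amounts to something like the sketch below. The helper name `seed_everything` is hypothetical and not part of DGL or PyTorch, and the NumPy, PyTorch, and DGL imports are optional so the snippet also runs where they are not installed. This is a minimal sketch, not a guarantee of bit-identical results: with `torch.use_deterministic_algorithms(True)`, operations that have no deterministic implementation raise an error instead of silently varying.

```python
import os
import random


def seed_everything(seed: int) -> None:
    """Hypothetical helper: seed every RNG that is installed, skip the rest."""
    random.seed(seed)
    # Only affects subprocesses; hashing in the current interpreter is fixed at startup.
    os.environ["PYTHONHASHSEED"] = str(seed)
    # cuBLAS needs this for deterministic results on CUDA 10.2 or later;
    # to be safe, set it before the first CUDA call.
    os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")

    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass

    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # silently ignored when CUDA is unavailable
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
        # Raise an error on ops that lack a deterministic implementation.
        torch.use_deterministic_algorithms(True)
    except ImportError:
        pass

    try:
        import dgl
        dgl.seed(seed)  # seeds DGL's own random number generator
    except ImportError:
        pass


seed_everything(0)
first = random.random()
seed_everything(0)
second = random.random()
print(first == second)
```

Calling the helper once at the top of the training script, before any model or data loader is created, is the usual placement.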

DGL doesn’t use scatter-add; we use SpMM instead, which is deterministic.
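As background on why the choice of kernel matters: floating-point addition is not associative, so a kernel that accumulates the same values in a different order on each run (as atomic-add based scatter operations on a GPU can) may return slightly different sums. The snippet below is a pure-Python illustration of that effect on the CPU, not a reproduction of any DGL or PyTorch kernel; `sequential_sum` is a helper written just for this example.

```python
import math


def sequential_sum(values):
    """Add the values left to right, the way a simple accumulation loop would."""
    total = 0.0
    for v in values:
        total += v
    return total


# The same three numbers, accumulated in two different orders.
order_a = [1e16, 1.0, -1e16]
order_b = [1e16, -1e16, 1.0]

sum_a = sequential_sum(order_a)  # the 1.0 is absorbed by 1e16 before it cancels
sum_b = sequential_sum(order_b)  # the large terms cancel first, so the 1.0 survives

print(sum_a, sum_b)  # prints: 0.0 1.0

# math.fsum tracks the rounding error exactly, so it does not depend on the order.
print(math.fsum(order_a), math.fsum(order_b))  # prints: 1.0 1.0
```

A fixed accumulation order gives the same result on every run, which fits the observation that the CPU runs are reproducible.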
