KeyError error message in master head?

We have a model that works with the 0.7 release, then switched to master head because we need the distributed negative sampler feature. Now we run into this error message (on a cluster of two machines, each with one graph partition):

I'm not sure what this error message means. Since it worked on the 0.7 release, is this likely a bug?

Any suggestions? Thanks a lot!

Could you describe how to reproduce this issue? Thanks!

What I did is similar to what is in example/pytorch/graphsage/experiment/dist_train.py.

I first partitioned the graph into two partitions (using the default partition algorithm, metis), then loaded one partition on each machine, so distributed training runs on two machines. Given this error message, I believe the error happens while loading the graph partition; nothing else has happened yet at that point.
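For reference, the setup is roughly like the sketch below. This is a minimal illustration under assumptions, not the exact script: `graph_name`, `out_dir`, and the helper function names are hypothetical, and it assumes `dgl.distributed.partition_graph` writes its config to `<out_dir>/<graph_name>.json`. The return value of `load_partition` differs between DGL versions, so the sketch passes it through without unpacking.

```python
import os


def part_config_path(out_dir, graph_name):
    # Assumed location of the partition config JSON written by partition_graph.
    return os.path.join(out_dir, graph_name + ".json")


def partition(g, graph_name, out_dir, num_parts=2):
    # DGL is imported lazily so the path helper above works without it.
    from dgl.distributed import partition_graph

    # Split the graph into `num_parts` partitions with METIS.
    partition_graph(g, graph_name, num_parts, out_dir, part_method="metis")
    return part_config_path(out_dir, graph_name)


def load_local_partition(part_config, part_id):
    from dgl.distributed import load_partition

    # Each machine loads only its own partition (part_id 0 or 1 here).
    # The result is a tuple whose length depends on the DGL version.
    return load_partition(part_config, part_id)
```

With two machines, machine 0 would call `load_local_partition(cfg, 0)` and machine 1 would call `load_local_partition(cfg, 1)`, where `cfg` is the path returned by `partition`.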

This bug should be fixed by Pull Request #3234 in dmlc/dgl by classicsong: "[Bugfix] Fix bug introduced by https://github.com/dmlc/dgl/pull/3131".


Ah! Indeed. Thank you!
