Distributed training as a whole works fine, but before the graph is loaded I get a lot of errors about the IP address.
What could be the cause? If there is no easy fix, I am thinking of changing the code here (dgl/rpc_client.py at 195f99362d883f8b6d131b70a7868a537e55b786 · dmlc/dgl · GitHub) so that when this error occurs repeatedly, the error message is printed only once. Would it be reasonable to submit a PR for this?
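To make the idea concrete, here is a rough sketch of the "print only once" behaviour I have in mind. It is not the actual DGL code: `connect_with_retry`, `try_connect`, `max_retries`, `retry_delay` and `log` are all hypothetical names I made up for illustration, and I am assuming the failure shows up as an `OSError` inside a retry loop.

```python
import time


def connect_with_retry(try_connect, max_retries=5, retry_delay=0.0, log=print):
    """Call try_connect() until it succeeds, reporting the failure only once.

    try_connect is a hypothetical callable standing in for whatever the
    client does to reach the server; it is assumed to raise OSError on failure.
    """
    warned = False   # whether the error message has already been printed
    failures = 0     # number of failed attempts so far
    for _ in range(max_retries):
        try:
            result = try_connect()
        except OSError as err:
            failures += 1
            if not warned:
                # Print the error on the first failure only.
                log(f"Connection failed ({err}); retrying quietly...")
                warned = True
            time.sleep(retry_delay)
            continue
        if failures:
            # Summarize the retries once the connection finally succeeds.
            log(f"Connected after {failures} failed attempt(s).")
        return result
    raise ConnectionError(f"Could not connect after {max_retries} attempts")
```

With this shape, a server that comes up late would produce one error line and one summary line instead of one line per retry.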
Thanks.