Error when running with multiple GPUs (single node)

Hi DGL community,

Recently I was trying to run distributed training (dgl/ at master · dmlc/dgl · GitHub) on a 4-GPU machine.

With 1 GPU and with 2 GPUs, it worked fine. However, when I try 4 GPUs, I get the following error:
“dgl._ffi.base.DGLError: Cannot assign node feature “h” on device cuda:1 to a graph on device cuda:0. Call to copy the graph to the same device.”

Can someone please advise on what might be the reason?
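
For context, the error above says that a feature tensor and the graph ended up on different GPUs. Below is a minimal sketch of how one might sanity-check the device each trainer would pick. It assumes, hypothetically, that each trainer selects its GPU as its local rank modulo the number of GPUs. The helper names `pick_device` and `check_same_device` are made up for illustration and are not taken from the DGL example script.

```python
def pick_device(local_rank: int, num_gpus: int) -> str:
    """Hypothetical helper: map a trainer's local rank to a device string.

    Assumption (not from the DGL script): device index = local_rank % num_gpus.
    """
    if local_rank < 0:
        raise ValueError("local_rank must be non-negative")
    if num_gpus <= 0:
        # No GPUs available: fall back to CPU.
        return "cpu"
    return f"cuda:{local_rank % num_gpus}"


def check_same_device(graph_device: str, feature_device: str) -> None:
    """Raise early, with a readable message, if the two devices differ."""
    if graph_device != feature_device:
        raise ValueError(
            f"feature is on {feature_device} but graph is on {graph_device}"
        )


if __name__ == "__main__":
    # Print the device each of 4 trainers would use on a 4-GPU machine.
    for rank in range(4):
        print(rank, pick_device(rank, num_gpus=4))
```

Logging the chosen device per trainer like this, next to the device of the graph and of the feature tensor, could help narrow down which process mixes cuda:0 and cuda:1.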

All the best,


What is your detailed configuration? How did you launch the job, is it by our

Thank you for your reply!

Yes, I ran it that way, on an AWS g4dn.12xlarge instance.

OS: Linux
DGL installed from conda

--num_trainers 4
--num_samplers 4
--num_servers 1 \