How to split data on a heterogeneous graph for multi-GPU training in DGL?

I’m working with a heterogeneous graph in DGL and plan to train my model with multi-GPU (distributed) training. I’m stuck on how to split the node and edge data correctly to support this setup.

Specifically, I’m looking for guidance on:

  • How to split node or edge data for heterogeneous graphs in a way that’s compatible with multi-GPU / multi-process training.
  • Whether there are any built-in DGL utilities for partitioning heterogeneous graphs across GPUs.
  • Best practices or examples for setting up distributed dataloaders with heterogeneous graphs.

Any tips, examples, or references would be super helpful. Thanks a lot!