Multi-GPU support and CommNet in DGL

Hi all,

I noticed that the GCN, GraphSAGE, and GAT examples in DGL support multi-GPU training.

Are other GNNs supported with multi-GPU as well, for example GIN, Gated-GNN (GGNN), and so on?

Also, is CommNet [1] implemented in DGL?

[1] Sainbayar Sukhbaatar, arthur szlam, and Rob Fergus. 2016. Learning Multiagent Communication with Backpropagation. In Advances in Neural Information Processing Systems 29, D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett (Eds.). Curran Associates, Inc., 2244–2252. http://papers.nips.cc/paper/6398-learning-multiagent-communication-with-backpropagation.pdf

Thank you.

Xuhao Chen
cxh@utexas.edu

DGL’s multi-GPU support comes from the backend framework (e.g., MXNet’s or PyTorch’s distributed training support), so in principle any GNN written in DGL can be trained on multiple GPUs.
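Roughly, the per-process setup looks like the sketch below (a minimal sketch rather than code from any official example; the model, hidden sizes, and the address/port are placeholders). Each GPU gets its own process, and PyTorch’s DistributedDataParallel averages gradients across them.

```python
# Minimal multi-GPU sketch: one process per GPU, model wrapped in DDP.
# The TwoLayerGCN model, feature sizes, and the TCP address are placeholders.
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch.nn.parallel import DistributedDataParallel
import dgl.nn as dglnn

class TwoLayerGCN(torch.nn.Module):
    def __init__(self, in_feats, hidden, n_classes):
        super().__init__()
        self.conv1 = dglnn.GraphConv(in_feats, hidden)
        self.conv2 = dglnn.GraphConv(hidden, n_classes)

    def forward(self, g, x):
        h = torch.relu(self.conv1(g, x))
        return self.conv2(g, h)

def run(rank, world_size):
    # NCCL is the usual backend for GPU training.
    dist.init_process_group("nccl", init_method="tcp://127.0.0.1:29500",
                            rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)
    model = TwoLayerGCN(16, 32, 4).to(rank)
    model = DistributedDataParallel(model, device_ids=[rank])
    # ... build this rank's dataloader and run the usual training loop;
    # DDP synchronizes gradients across GPUs on backward().

if __name__ == "__main__":
    n_gpus = torch.cuda.device_count()
    mp.spawn(run, args=(n_gpus,), nprocs=n_gpus)
```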

The implementation depends on your problem setting. If you are dealing with a single giant graph with billions of edges, you must first partition the graph (e.g., with METIS) or use some kind of sampling algorithm; the GCN/GraphSAGE/GAT examples we provide belong to this case.
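For the sampling route, the inner loop on each GPU looks roughly like this. This is only a sketch: the class names (`MultiLayerNeighborSampler`, `NodeDataLoader`) follow recent DGL releases and may differ in older versions, and `g`, `train_nids` (this process’s share of training nodes), `model`, `device`, and the `"feat"`/`"label"` field names are placeholders. The model here is assumed to be written to consume a list of blocks, one per layer.

```python
# Hedged sketch: mini-batch training on a giant graph with neighbor sampling.
import dgl
import torch

sampler = dgl.dataloading.MultiLayerNeighborSampler([10, 25])  # fanout per layer
dataloader = dgl.dataloading.NodeDataLoader(
    g, train_nids, sampler,
    batch_size=1024, shuffle=True, drop_last=False)

for input_nodes, output_nodes, blocks in dataloader:
    blocks = [b.to(device) for b in blocks]
    x = blocks[0].srcdata["feat"]      # features of the sampled frontier
    y = blocks[-1].dstdata["label"]    # labels of the seed nodes
    logits = model(blocks, x)          # model takes a list of blocks
    loss = torch.nn.functional.cross_entropy(logits, y)
    # ... backward/step as usual; with DDP each GPU trains on its own seed nodes
```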

If you are dealing with many small graphs, you just need to partition the dataset (for example, with 40,000 graphs you can split them into 8 groups of 5,000 each) and let each process (GPU) handle one group; no sampling or graph partitioning is required. You can refer to our Transformer example. I think GGNN and the CommNet you mentioned belong to this case.
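A rough sketch of that dataset split, assuming each dataset item is a (graph, label) pair: a plain PyTorch DistributedSampler gives each rank its own shard of the graphs, and dgl.batch merges each shard’s graphs into one batched graph. `dataset`, `device`, and `num_epochs` are placeholders.

```python
# Hedged sketch: sharding many small graphs across GPUs with DistributedSampler.
import dgl
import torch
from torch.utils.data import DataLoader
from torch.utils.data.distributed import DistributedSampler

def collate(samples):
    # each sample is assumed to be a (graph, label) pair
    graphs, labels = map(list, zip(*samples))
    return dgl.batch(graphs), torch.tensor(labels)

sampler = DistributedSampler(dataset)   # gives each rank its own subset of graphs
loader = DataLoader(dataset, batch_size=32, sampler=sampler, collate_fn=collate)

for epoch in range(num_epochs):
    sampler.set_epoch(epoch)            # reshuffle the shards each epoch
    for batched_graph, labels in loader:
        batched_graph = batched_graph.to(device)
        labels = labels.to(device)
        # forward/backward on the DDP-wrapped model, as in the first sketch
```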

As for CommNet, we do not support it yet, and we welcome contributions from the community!