How does DGL save GPU memory?

  Hi, I am new to DGL and GNNs.
  When I run the GraphSAGE example on the Reddit dataset (my GPU is a Tesla T4), I found that DGL can fit the whole training set in a single batch for training or inference, while PyG runs out of memory (OOM) once the batch size reaches about 9000.
  I want to know what makes DGL use less memory than PyG.

P.S. Is this benefit due to 'kernel fusion'?
  I found a blog post about it, but I did not understand it clearly.

I got the answer on GitHub, thanks to yzh119's reply. Since other people may have the same question, I am copying the reply here:

Yep, it's because of kernel fusion.
You can think of it this way: we compute the result directly on the destination nodes without first copying node features onto the edges. That copy, which most scatter-gather-based frameworks such as PyG require, costs #edges * D of GPU memory (D being the feature size), and skipping it is where the saving comes from.
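
To make the reply above more concrete, here is a minimal plain-Python sketch of the two aggregation styles. It is only an illustration of the idea, not DGL's or PyG's actual kernels: the function names, the tiny graph, and the sizes passed to `edge_buffer_bytes` are all made up for this example, and float32 storage (4 bytes per value) is an assumption.

```python
# Illustrative sketch only: plain-Python stand-ins for GPU kernels.
# All names and numbers here are hypothetical, not DGL or PyG APIs.

def aggregate_scatter_gather(src, dst, feats, num_nodes):
    """Sum neighbor features by first materializing one message per edge."""
    dim = len(feats[0])
    # Gather: copy the source node's feature onto every edge (#edges * D values).
    messages = [list(feats[u]) for u in src]
    # Scatter: add each edge message into its destination node.
    out = [[0.0] * dim for _ in range(num_nodes)]
    for msg, v in zip(messages, dst):
        for k in range(dim):
            out[v][k] += msg[k]
    extra_values = len(messages) * dim  # size of the intermediate edge buffer
    return out, extra_values


def aggregate_fused(src, dst, feats, num_nodes):
    """Sum neighbor features straight into destinations, with no edge buffer."""
    dim = len(feats[0])
    out = [[0.0] * dim for _ in range(num_nodes)]
    for u, v in zip(src, dst):
        for k in range(dim):
            # Read the source feature and accumulate in one step.
            out[v][k] += feats[u][k]
    return out, 0


def edge_buffer_bytes(num_edges, dim, bytes_per_value=4):
    """Memory for the per-edge message buffer, assuming float32 by default."""
    return num_edges * dim * bytes_per_value


# Tiny made-up graph: edges 0->1, 1->2, 2->0, 2->1 with 2-dimensional features.
src = [0, 1, 2, 2]
dst = [1, 2, 0, 1]
feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

out_a, extra_a = aggregate_scatter_gather(src, dst, feats, 3)
out_b, extra_b = aggregate_fused(src, dst, feats, 3)
```

Both functions return the same sums, but only the first one allocates the intermediate per-edge buffer. With hypothetical sizes of 1,000,000 edges and D = 256, `edge_buffer_bytes(1_000_000, 256)` gives 1,024,000,000 bytes, roughly 1 GB for a single layer's messages, which is the kind of cost the fused approach avoids.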
