Best way to send (batched) graphs to GPU

If I want to put a graph on a GPU (say "cuda:0"), how can I do that? I'm looking for the equivalent of the torch.Tensor.to("cuda:0") operation. I ask because I have a large dataset of small graphs that I'm loading with a DataLoader and batching together, as shown in the "Batched Graph Classification With DGL" tutorial. I don't want to put the entire dataset on the GPU (it's too big), but I do want to send the graph batches to the GPU during training and inference.

Any help would be appreciated.


You can move the batched graph's features to the device at the beginning of the forward pass, e.g. h = bg.ndata['h']; h = h.to(device).
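A minimal sketch of that pattern, assuming the node features were stored under the key 'h' when the graphs were built (the layer sizes and the key are placeholders):

import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self, in_feats=16, out_feats=8):
        super().__init__()
        self.linear = nn.Linear(in_feats, out_feats)

    def forward(self, bg, device):
        # Pull the node features off the batched graph and move them
        # to the target device before any computation touches them.
        h = bg.ndata['h'].to(device)
        return self.linear(h)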

Thanks @taiky for the answer. Currently you need to move each feature one by one. I'm thinking about providing a helper function like:

g = ...
g.to(device)  # move all the feature data to device

What do you think?


Thanks taiky and minjie!
Yes, I think such a helper function would be useful. I'm a DGL and PyTorch noob, so it's not clear to me which parts of a graph g need to be on the GPU for the model to perform calculations on it. Is it just g.ndata and g.edata?

Only g.ndata and g.edata. Where the graph structure is stored is currently managed by DGL: it starts out on the CPU, but is cached on the GPU when required, and the cache is invalidated if the graph is mutated.

Hi minjie!
If I send all of my g.edata and g.ndata to the device before running model.forward(), I notice that g.in_degrees().view(-1, 1).float() is still on my CPU. I could always send it to the GPU with .to("cuda:0"), but that would require passing the device ("cuda:0") as an argument to my forward() function. Is there a nice way to get around this?

Hi,

How about .to(input_tensor.device)?
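For example, a tensor you have already moved carries its device with it, so the degree tensor can follow it without forward() ever taking a device argument. A sketch with a hypothetical tiny graph:

import dgl
import torch

g = dgl.DGLGraph()
g.add_nodes(3)
g.add_edges([0, 1], [1, 2])
g.ndata['h'] = torch.randn(3, 4).to('cuda:0')  # features already on the GPU

h = g.ndata['h']
# in_degrees() returns a CPU tensor; send it to wherever h already lives.
degs = g.in_degrees().view(-1, 1).float().to(h.device)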

In case it’s helpful, here’s what I did:

def send_graph_to_device(g, device):
    # Move all node features to the target device.
    for key in g.node_attr_schemes().keys():
        g.ndata[key] = g.ndata.pop(key).to(device, non_blocking=True)

    # Move all edge features to the target device.
    for key in g.edge_attr_schemes().keys():
        g.edata[key] = g.edata.pop(key).to(device, non_blocking=True)
    return g
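A training loop can then move each batch as it arrives; here device and dataloader are placeholders for your own setup:

import torch

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
for bg, labels in dataloader:
    bg = send_graph_to_device(bg, device)
    labels = labels.to(device)
    # ... forward pass, loss, backward ...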

Is there an update to this? Any official helper function?

Hi @adamoyoung, did you eventually figure out a solution to this issue? The code works for me on my own dataset when I run on the CPU. However, when I load the model and the graphs onto the GPU, I always get this error:

AttributeError: 'NoneType' object has no attribute 'in_degrees'

With the latest version of DGL, you should now be able to send a DGLGraph to the GPU with g.to(torch.device('cuda:0')).

It looks like you are calling in_degrees on a NoneType object. One possibility is that in your version g.to(device) performs the move in place and returns None, so assigning its result replaces the graph with None. If that is the case, you can try installing the latest version from source, which should solve the problem.
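On recent versions the single call replaces any per-feature loop; a sketch, assuming g.to returns the moved graph as described above:

import dgl
import torch

g = dgl.DGLGraph()
g.add_nodes(2)
g.add_edges([0], [1])
g.ndata['h'] = torch.randn(2, 4)

g = g.to(torch.device('cuda:0'))  # returns the graph with features on the GPU
print(g.ndata['h'].device)        # cuda:0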

Hi, I also ran into this problem.

My DGL version is 0.4.1. Do I need to update to 0.5?

You can either change

G = G.to(th.device('cuda:1'))

to

G.to(th.device('cuda:1'))

(in 0.4.1, to() moves the features in place and returns None, so the assignment overwrites G with None), or install the latest version from source.

Or install a nightly build with pip install --pre dgl (CPU) or pip install --pre dgl-cu90 (or the package matching your CUDA version).

Try installing the DGL build compatible with your CUDA version. If you have CUDA 10.1 and want to install using pip, run pip install --pre dgl-cu101.