DiffPool seems to be very memory inefficient? I run out of memory even with small batch sizes

I am trying to use DGL's DiffPool for graph classification on my own dataset. I am using this implementation and only replacing train.py with my own code:

The problem is I can't even finish one epoch without running out of VRAM, even though I have 6 GB of VRAM. Even setting the batch size to 1 or 2 didn't help. This is the error:

/dgl/diffpool/model/dgl_layers/gnn.py", line 131, in forward
    current_lp_loss = torch.norm(adj.to_dense() -
RuntimeError: CUDA out of memory. Tried to allocate 1.54 GiB (GPU 0; 5.93 GiB total capacity; 3.55 GiB already allocated; 1.10 GiB free; 3.83 GiB reserved in total by PyTorch)

Is this normal? Why is it running out of memory when the batch size is as small as 1 or 2?

I don't think there is a problem with my code, because I have used essentially the same code (with only minor differences) with other architectures such as GCN, and I never ran out of VRAM on this dataset even with batch_size > 64.

I am also using the default parameters:

                pool_ratio=0.15,
                num_pool=1,
                cuda=0,
                lr=1e-3,
                clip=2.0,
                batch_size=2,
                epoch=100,
                train_ratio=0.7,
                test_ratio=0.1,
                n_worker=1,
                gc_per_block=3,
                dropout=0.0,
                method='diffpool',
                bn=True,
                bias=True,
                save_dir="./model_param",
                load_epoch=-1,
                data_mode='default'

Hi,

This is expected, because DiffPool involves a lot of dense computation. The adjacency matrix after pooling is a dense matrix. If you batch 30 graphs with about 300 nodes per graph, the batched adjacency matrix is about 10000 x 10000 and dense, so it consumes a lot of memory. Other GNN models mostly use sparse computation, which is highly optimized in DGL.
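
Just to make the scale concrete, here is a rough back-of-the-envelope sketch (the 30-graph / 300-node numbers are only illustrative; the actual shapes depend on your dataset, batch size and pool_ratio):

    # Rough, illustrative estimate of how big a dense batched adjacency matrix gets.
    num_graphs = 30          # hypothetical graphs per batch
    nodes_per_graph = 300    # hypothetical average nodes per graph
    n = num_graphs * nodes_per_graph          # the batched graph has n = 9000 nodes
    bytes_per_float32 = 4

    dense_adj_gib = n * n * bytes_per_float32 / 1024**3
    print(f"one dense {n} x {n} float32 matrix: {dense_adj_gib:.2f} GiB")  # ~0.30 GiB

    # The link-prediction loss in gnn.py (torch.norm(adj.to_dense() - ...)) has to
    # materialize adj.to_dense(), the tensor it is compared against, and their
    # difference, and autograd keeps intermediates for the backward pass, so several
    # buffers of this size can be alive at once and quickly exhaust 6 GB of VRAM.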

But I can't use it even with a batch size of 1. There seems to be a problem, because I ran out of memory after 120-130 batches, not at the start, which suggests the memory is not getting cleaned up; I would expect the VRAM to be freed after each batch. I am using the same training code that I have tried with other GNN models, so I doubt the problem is in my code.

Could you raise an issue on DGL's GitHub repo? It's possible that the bug is inside DGL. Also, did you make any modifications to the original code?
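
One thing worth double-checking in your train.py before filing the issue: GPU memory that keeps growing batch after batch is often caused by holding on to tensors that are still attached to the autograd graph, for example accumulating the loss tensor itself instead of loss.item(). This is only a hypothetical sketch of the pattern, since I haven't seen your training loop:

    import torch
    import torch.nn as nn

    # Minimal, hypothetical setup just to illustrate the pattern (not the DiffPool model).
    model = nn.Linear(8, 2)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    batches = [(torch.randn(4, 8), torch.randint(0, 2, (4,))) for _ in range(100)]

    total_loss = 0.0
    for features, labels in batches:
        optimizer.zero_grad()
        loss = criterion(model(features), labels)
        loss.backward()
        optimizer.step()

        # total_loss += loss        # keeps every batch's autograd graph alive, so memory grows
        total_loss += loss.item()   # .item() converts to a Python float, so the graph can be freed

If your loop already avoids this and memory still climbs across batches, then an issue on GitHub with a script that reproduces it would be the best next step.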