Loss.backward() takes too much time

Hi,

I started using DGL for NLP research, but I ran into a problem.

When I train my model on my PC, loss.backward() takes far too much time. The same code works fine on Google Colab, so I suspect that GPU/CUDA operations in DGL are not working properly on my PC. Other models that don't use a DGLGraph run fine on my machine.
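For reference, this is roughly what one training step looks like in my code (a heavily simplified sketch with a random graph and a placeholder model, not my actual NLP model):

```python
import torch
import torch.nn.functional as F
import dgl
import dgl.nn as dglnn

device = torch.device('cuda')

# Random stand-ins for my real graph, features, and labels
g = dgl.add_self_loop(dgl.rand_graph(1000, 5000)).to(device)
feat = torch.randn(1000, 128, device=device)
labels = torch.randint(0, 10, (1000,), device=device)

model = dglnn.GraphConv(128, 10).to(device)
opt = torch.optim.Adam(model.parameters())

opt.zero_grad()
logits = model(g, feat)           # forward pass: fast
loss = F.cross_entropy(logits, labels)
loss.backward()                   # this is the call that takes far too long on my PC
opt.step()
```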

My PC setup (where the problem occurs):

GPU : RTX3090
CUDA version : 11.1
python=3.9
pytorch=1.9.1
dgl=0.7.1

Can you help me figure out where the problem occurred?

Thanks

Hi,

How did you install DGL, from source or pip? And how slow is it exactly?
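One quick thing worth checking: with pip, the plain `dgl` package is the CPU-only build, and the CUDA builds are separate packages (e.g. `dgl-cu111` for CUDA 11.1). A rough way to check which one you have (just a sketch, not an official diagnostic):

```python
import torch
import dgl

print(dgl.__version__)              # e.g. 0.7.1
print(torch.cuda.is_available())    # should be True on the RTX 3090 machine

# Moving a small graph to the GPU should fail with an error
# if the installed DGL wheel was built without CUDA support.
g = dgl.rand_graph(10, 20).to('cuda')
print(g.device)
```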

I installed it using pip.
On Google Colab it takes about a second to train one step, but on my PC the same step takes around 30 minutes.

That's weird. Are you using Windows or Linux on your PC? And does this only happen in the backward computation, or is the forward computation also very slow?
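One thing to keep in mind when timing: CUDA kernels run asynchronously, so without explicit synchronization the elapsed time tends to get attributed to whichever call synchronizes first, which is often backward(). A minimal sketch for timing the two phases separately (model, g, feat, and labels are placeholders for your own objects, assumed to already be on the GPU):

```python
import time
import torch
import torch.nn.functional as F

def timed(fn):
    # Synchronize before and after so we measure the GPU work itself,
    # not just the time to launch the kernels.
    torch.cuda.synchronize()
    start = time.time()
    out = fn()
    torch.cuda.synchronize()
    return out, time.time() - start

logits, fwd_time = timed(lambda: model(g, feat))
loss = F.cross_entropy(logits, labels)
_, bwd_time = timed(loss.backward)
print(f'forward: {fwd_time:.3f}s  backward: {bwd_time:.3f}s')
```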

I use Linux on my PC, and this only happens in the backward computation.

Were you able to reproduce this problem on other machines (other than your PC and Colab)? This indeed sounds weird.
