Dgl/src/runtime/cuda/cuda_device_api.cc:103: Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading

joseph · November 18, 2021, 2:01am

While training my GNN model I keep getting errors like this:

I am using CUDA 11.1 and PyTorch 1.9.0.

Has anyone had a similar issue? How can I solve it?

Rhett-Ying · November 18, 2021, 2:08am

Hi, was the DGL build you installed also for CUDA 11.1? Does it match the PyTorch build you installed?
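One way to compare the two installs is a short script like the one below. It is only a sketch: `report_versions` is a hypothetical helper name, and it assumes DGL was installed from pip, where CUDA builds from this period were published under names such as `dgl-cu111`. It reads `torch.version.cuda` for the PyTorch side and lists installed distributions whose name starts with `dgl` for the DGL side.

```python
import importlib
from importlib import metadata


def report_versions():
    """Collect version info for PyTorch and DGL without assuming either is installed."""
    info = {}
    for name in ("torch", "dgl"):
        try:
            module = importlib.import_module(name)
        except Exception as exc:  # ImportError, or a failure while loading native libraries
            info[name] = f"import failed: {exc}"
            continue
        info[name] = getattr(module, "__version__", "unknown")
        if name == "torch":
            # CUDA version PyTorch was built against (None for CPU-only builds).
            info["torch_cuda"] = module.version.cuda

    # Assumption: a pip install, where the distribution name (for example
    # dgl-cu111) hints at the CUDA version the DGL build targets.
    dgl_dists = []
    for dist in metadata.distributions():
        dist_name = dist.metadata.get("Name") or ""
        if dist_name.lower().startswith("dgl"):
            dgl_dists.append(f"{dist_name} {dist.version}")
    info["dgl_distributions"] = sorted(dgl_dists)
    return info


if __name__ == "__main__":
    for key, value in report_versions().items():
        print(f"{key}: {value}")
```

If `torch_cuda` and the suffix of the DGL distribution name disagree, that mismatch would be the first thing to fix.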

joseph · November 18, 2021, 10:09am

Yes, I installed the CUDA 11.1 build of DGL, so they match.

Rhett-Ying · November 18, 2021, 10:16am

Are you training in distributed mode? Could you provide a minimal demo that reproduces the issue, or more details such as the call stack? I cannot find any more clues in the screenshot.
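A minimal sketch of one way to capture a full call stack is shown below. The names `run_and_capture` and `failing_step` are hypothetical, and `failing_step` is only a stand-in for one iteration of the real training loop. Setting the CUDA environment variable `CUDA_LAUNCH_BLOCKING=1` before any CUDA work starts makes kernel launches synchronous, which may help the traceback point at the call that actually failed.

```python
import os
import traceback

# Assumption: this runs before any CUDA work starts, so kernel launches are
# synchronous and the failing call shows up in the traceback.
os.environ.setdefault("CUDA_LAUNCH_BLOCKING", "1")


def run_and_capture(train_step):
    """Run a callable; return the full traceback text if it raises, else None."""
    try:
        train_step()
    except Exception:
        return traceback.format_exc()
    return None


def failing_step():
    # Hypothetical stand-in for one training iteration of the real model.
    raise RuntimeError("placeholder for the CUDA check failure")


if __name__ == "__main__":
    print(run_and_capture(failing_step))
```

Posting the returned text instead of a screenshot keeps the whole stack searchable.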
