cudaErrorCudartUnloading: CUDA: device-side assert triggered

Hi DGL team, I am using DGL for a graph classification task with dgl==0.6.1, pytorch==1.7.1, and CUDA 11.0. When I add these two lines to my script:

torch.backends.cudnn.benchmark = False
torch.backends.cudnn.deterministic = True

the following error appears:

/opt/dgl/src/runtime/cuda/cuda_device_api.cc:103: Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading: CUDA: device-side assert triggered

When these two lines are removed, the error does not appear. What is the reason, and how can I solve it? I would prefer not to update my PyTorch and DGL versions.
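For reference, here is a minimal sketch (not my actual training script) of how I set these flags, plus CUDA_LAUNCH_BLOCKING, which is the usual way to make a "device-side assert triggered" error point at the kernel launch that actually caused it instead of a later call:

```python
# Minimal sketch, assuming the flags are set at the top of the training script.
import os

# With asynchronous launches, a device-side assert (often an out-of-range
# label or node/edge index) is reported at an unrelated later CUDA call.
# Forcing synchronous launches surfaces it at the offending line.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch

# The two determinism-related settings from the question.
torch.backends.cudnn.benchmark = False
torch.backends.cudnn.deterministic = True
```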

Does anyone know how to solve it? I followed the suggestions in c10::CUDAError · Issue #67978 · pytorch/pytorch · GitHub, e.g. increasing num_workers in the DataLoader, but it does not seem to work.

DGL 0.6.1 is no longer supported (and I don’t think PyTorch 1.7.1 is supported either). May I know why updating PyTorch and DGL is not a possibility?
