Issue with DGL, torch 2.2 and CUDA 11.8

Sorry to bother you. I am writing because when I call dgl.DGLGraph.adj() I get the following error:

File "/home/rafael/.local/lib/python3.12/site-packages/dgl/heterograph.py", line 3823, in adj
    from .sparse import spmatrix
File "/home/rafael/.local/lib/python3.12/site-packages/dgl/sparse/__init__.py", line 43, in <module>
    load_dgl_sparse()
File "/home/rafael/.local/lib/python3.12/site-packages/dgl/sparse/__init__.py", line 40, in load_dgl_sparse
    raise ImportError("Cannot load DGL C++ sparse library")
ImportError: Cannot load DGL C++ sparse library

I tried reinstalling other versions, but I always get the same error.

Sorry for the disturbance.
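The ImportError above hides the loader error that actually caused it. A minimal sketch for surfacing that error, assuming you know the path of the sparse library inside your dgl install (the helper name try_load is made up for illustration, and it is probably best to import torch first so its own libraries are already loaded):

```python
import ctypes


def try_load(path):
    """Return None if the shared library loads, else the loader's error text."""
    try:
        ctypes.CDLL(path)
    except OSError as exc:
        # The message usually names the dependency that could not be opened.
        return str(exc)
    return None


if __name__ == "__main__":
    # Hypothetical path: replace it with the .so file from your own install.
    print(try_load("/path/to/site-packages/dgl/dgl_sparse/libdgl_sparse_pytorch_X.so"))
```

If the printed message names another .so file, that file is the one to look for, not the DGL library itself.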

Solved. I had to install the C++ libraries manually. Sorry for taking up your time.

Could you explain that more specifically? I ran into this problem too. Thank you so much.

Yes, of course. In my case it was because some C library files were missing, the ones that ship inside the cuDNN package on Ubuntu. Once I installed them the error went away. I also reinstalled cuSPARSE.

Another thing I tried was adding repositories that carry earlier CUDA packages, such as the universe repositories.

I hope this helps. If not, ask me again any time.
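A quick way to check whether the system can see the libraries mentioned above is to ask the standard library where it finds them. This is only a sketch: the short names cudnn, cusparse and nvrtc are my assumption about what to look for, and ctypes.util.find_library searches the system loader paths, so it will not see libraries that pip installed inside site-packages.

```python
import ctypes.util


def find_cuda_libs(names=("cudnn", "cusparse", "nvrtc")):
    """Map each short library name to what the system loader finds, or None."""
    return {name: ctypes.util.find_library(name) for name in names}


if __name__ == "__main__":
    for name, found in find_cuda_libs().items():
        print(f"{name}: {found or 'not found on the default search path'}")
```

A None result means the system package that provides that library is missing or not registered with the loader.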

Oh, thank you so much. When I searched for these missing C files I could not find them. Could you tell me how you found them? Were they already on your computer, or did you get them some other way?

I changed the torch version and the problem did not occur again; now it only complains about some missing packages. I don't know whether this is a real fix. The Deep Graph Library website (dgl.ai) says Windows uses torch versions 2.0 and 2.1, so the following is my env, openhgnn2:
torch 2.0.0+cu118
torchaudio 2.0.0+cu118
torchdata 0.6.0
torchvision 0.15.0+cu118
dgl 1.1.2+cu118
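In the list above, torch and dgl carry the same +cu118 suffix, which is what you want. A small sketch for checking that from two version strings (the helper names are hypothetical, and it only compares the suffix, not whether DGL ships a build for that exact torch release):

```python
def cuda_tag(version):
    """Return the local build tag after '+', e.g. 'cu118', or None if absent."""
    _, sep, local = version.partition("+")
    return local if sep else None


def same_cuda_build(torch_version, dgl_version):
    """True only when both versions carry the same non-empty build tag."""
    torch_tag = cuda_tag(torch_version)
    return torch_tag is not None and torch_tag == cuda_tag(dgl_version)


if __name__ == "__main__":
    print(same_cuda_build("2.0.0+cu118", "1.1.2+cu118"))
```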

Yes, so the C files are missing because some packages, such as cuDNN and cuSPARSE, fail during installation. If you add universe to your repositories, you should be able to install those packages. (Note that in this case it is a CUDA issue, not a DGL one.) What exactly did the error say, if I may ask?

Oh, thank you. I think it may be because the DGL website says it no longer maintains the Windows version. Maybe that is why I have this problem.

Ah, that could be it. I haven't tried the installation on Windows, sorry. Maybe I can take a look, but to be honest I don't know much about it.

Oh, now I have tried Linux and the same problem occurred there too. Actually, I still do not know where I can get the C++ file, or what the file is specifically called. I can't find it on the website.

And this path does contain the file: /root/miniconda3/envs/opehgnn/lib/python3.11/site-packages/dgl/dgl_sparse/libdgl_sparse_pytorch_2.4.1.so
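The file name above ends in a torch version, so it may be worth checking which torch versions the installed DGL actually ships a sparse library for. A rough sketch, assuming (I have not confirmed this for every DGL release) that the files are named libdgl_sparse_pytorch_<torch version>.so and that the loader picks one by the torch version without its +cu suffix; both function names are made up here:

```python
from pathlib import Path

PREFIX = "libdgl_sparse_pytorch_"
SUFFIX = ".so"


def bundled_torch_versions(sparse_dir):
    """List the torch versions that have a sparse library in sparse_dir."""
    versions = []
    for path in Path(sparse_dir).glob(PREFIX + "*" + SUFFIX):
        versions.append(path.name[len(PREFIX):-len(SUFFIX)])
    return sorted(versions)


def has_build_for(sparse_dir, torch_version):
    """Check for a bundled library matching torch_version minus any '+cuXXX'."""
    base = torch_version.split("+", 1)[0]
    return base in bundled_torch_versions(sparse_dir)
```

For example, has_build_for(".../site-packages/dgl/dgl_sparse", torch.__version__) would tell you whether the installed torch has a matching file. If it does, the file itself is there and the failure is more likely in one of its dependencies.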

For me, adding only the cpp file did not work because it has many dependencies. It is better to find the package that contains it. For example, c_sparse came with sudo apt-get install cusparse. From what I have seen, at least for DGL, a missing C_*** file normally lives in the matching cu*** package.

I don't think it is really a Python dependency error.

So it is because the cuDNN and cuSPARSE packages failed to install in my env, and I need to install them again to fix the loading problem. Is that right?

In my experience, that is what solved the issue. If you run into further problems I can try to help.

Oh, thank you so much. I searched for some information online, and I wonder whether I simply did not configure the path correctly. I don't know; maybe I will try your advice first.

I also want to show you my env and what I ran. It may be a little long.

(openhgnn) user@user-System-Product-Name:~$ pip list
Package Version


absl-py 2.1.0
alembic 1.14.0
annotated-types 0.7.0
Brotli 1.0.9
certifi 2024.8.30
charset-normalizer 3.3.2
colorama 0.4.6
colorlog 6.9.0
dgl 2.4.0+cu124
filelock 3.13.1
fsspec 2024.2.0
gmpy2 2.1.2
greenlet 3.1.1
grpcio 1.67.1
idna 3.7
Jinja2 3.1.4
joblib 1.4.2
littleutils 0.2.4
Mako 1.3.6
Markdown 3.7
MarkupSafe 2.1.3
mkl_fft 1.3.10
mkl_random 1.2.7
mkl-service 2.4.0
mpmath 1.3.0
networkx 3.2.1
numpy 1.26.3
nvidia-cublas-cu12 12.4.2.65
nvidia-cuda-cupti-cu12 12.4.99
nvidia-cuda-nvrtc-cu12 12.4.99
nvidia-cuda-runtime-cu12 12.4.99
nvidia-cudnn-cu12 9.1.0.70
nvidia-cufft-cu12 11.2.0.44
nvidia-curand-cu12 10.3.5.119
nvidia-cusolver-cu12 11.6.0.99
nvidia-cusparse-cu12 12.3.0.142
nvidia-nccl-cu12 2.20.5
nvidia-nvjitlink-cu12 12.4.99
nvidia-nvtx-cu12 12.4.99
ogb 1.3.6
openhgnn 0.7.0
optuna 4.0.0
outdated 0.2.2
packaging 24.1
pandas 2.2.3
pillow 10.2.0
pip 24.2
protobuf 5.28.3
psutil 5.9.0
pydantic 2.9.2
pydantic_core 2.23.4
PySocks 1.7.1
python-dateutil 2.9.0.post0
pytz 2024.2
PyYAML 6.0.2
requests 2.32.3
scikit-learn 1.5.2
scipy 1.13.1
setuptools 75.3.0
six 1.16.0
SQLAlchemy 2.0.36
sympy 1.13.2
tensorboard 2.18.0
tensorboard-data-server 0.7.2
threadpoolctl 3.5.0
torch 2.4.0+cu124
torchaudio 2.4.0+cu124
torchvision 0.19.0+cu124
tqdm 4.66.5
triton 3.0.0
typing_extensions 4.11.0
tzdata 2024.2
urllib3 2.2.3
Werkzeug 3.1.2
wheel 0.44.0

(openhgnn) user@user-System-Product-Name:~$ python
Python 3.11.10 (main, Oct 3 2024, 07:29:13) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> import torch
>>> torch.cuda.is_available()
True

>>> import dgl
>>> from openhgnn import Experiment
Traceback (most recent call last):
  File "/home/user/anaconda3/envs/openhgnn/lib/python3.11/site-packages/dgl/sparse/__init__.py", line 38, in load_dgl_sparse
    torch.classes.load_library(path)
  File "/home/user/anaconda3/envs/openhgnn/lib/python3.11/site-packages/torch/_classes.py", line 52, in load_library
    torch.ops.load_library(path)
  File "/home/user/anaconda3/envs/openhgnn/lib/python3.11/site-packages/torch/_ops.py", line 1295, in load_library
    ctypes.CDLL(path)
  File "/home/user/anaconda3/envs/openhgnn/lib/python3.11/ctypes/__init__.py", line 376, in __init__
    self._handle = _dlopen(self._name, mode)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: libnvrtc.so.12: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/user/anaconda3/envs/openhgnn/lib/python3.11/site-packages/openhgnn/__init__.py", line 2, in <module>
    from .dataset import *
  File "/home/user/anaconda3/envs/openhgnn/lib/python3.11/site-packages/openhgnn/dataset/__init__.py", line 4, in <module>
    from .utils import load_acm, load_acm_raw, generate_random_hg
  File "/home/user/anaconda3/envs/openhgnn/lib/python3.11/site-packages/openhgnn/dataset/utils.py", line 10, in <module>
    from dgl import sparse as dglsp
  File "/home/user/anaconda3/envs/openhgnn/lib/python3.11/site-packages/dgl/sparse/__init__.py", line 43, in <module>
    load_dgl_sparse()
  File "/home/user/anaconda3/envs/openhgnn/lib/python3.11/site-packages/dgl/sparse/__init__.py", line 40, in load_dgl_sparse
    raise ImportError("Cannot load DGL C++ sparse library")
ImportError: Cannot load DGL C++ sparse library
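The first traceback says the loader could not open libnvrtc.so.12, even though the pip list shows nvidia-cuda-nvrtc-cu12 installed. One possible explanation (an assumption on my part, not something the traceback proves) is that the file exists inside site-packages but is not on the loader's search path. A sketch that looks for it and builds a candidate LD_LIBRARY_PATH value; both helper names are made up for illustration:

```python
import os
from pathlib import Path


def find_shared_lib(root, soname="libnvrtc.so.12"):
    """Return the sorted directories under root that contain soname."""
    return sorted({str(p.parent) for p in Path(root).rglob(soname)})


def ld_library_path_with(dirs, current=""):
    """Put dirs in front of the existing search path, dropping duplicates."""
    dirs = list(dirs)
    rest = [p for p in current.split(os.pathsep) if p and p not in dirs]
    return os.pathsep.join(dirs + rest)


if __name__ == "__main__":
    import sysconfig

    site_packages = sysconfig.get_paths()["purelib"]
    hits = find_shared_lib(site_packages)
    print(hits or "libnvrtc.so.12 not found under site-packages")
    print(ld_library_path_with(hits, os.environ.get("LD_LIBRARY_PATH", "")))
```

Note that LD_LIBRARY_PATH is read when the process starts, so the printed value has to be exported in the shell before launching python; setting it from inside the running interpreter will not help.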

Let's try updating cuSPARSE. I assume DGL is also built for CUDA 12.4, right?

Yes, and I think the versions in the env are compatible with each other.

I think so too. If the problem continues, we can try the conda build.

Yes. Yesterday I found that my env has some of the CUDA toolkit installed, but not all of it. It looks like this:

nvidia-cublas-cu12 12.4.2.65
nvidia-cuda-cupti-cu12 12.4.99
nvidia-cuda-nvrtc-cu12 12.4.99
nvidia-cuda-runtime-cu12 12.4.99
nvidia-cudnn-cu12 9.1.0.70
nvidia-cufft-cu12 11.2.0.44
nvidia-curand-cu12 10.3.5.119
nvidia-cusolver-cu12 11.6.0.99
nvidia-cusparse-cu12 12.3.0.142
nvidia-nccl-cu12 2.20.5
nvidia-nvjitlink-cu12 12.4.99
nvidia-nvtx-cu12 12.4.99

And I have a question: can the CUDA toolkit be installed inside a virtual env on Linux? Does every env have its own CUDA toolkit?
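As far as I understand it, the nvidia-*-cu12 entries listed above are CUDA runtime libraries that pip installed into this one env, so each env can carry its own copy, while the NVIDIA driver is shared by the whole system. A small sketch for listing what the current env contains, using only the standard library (the function name cuda_wheels is hypothetical):

```python
from importlib import metadata


def cuda_wheels(prefix="nvidia-"):
    """Return {distribution name: version} for packages whose name starts with prefix."""
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name and name.lower().startswith(prefix):
            found[name] = dist.version
    return dict(sorted(found.items()))


if __name__ == "__main__":
    for name, version in cuda_wheels().items():
        print(name, version)
```

Running it in two different envs should show whether they carry different CUDA library versions.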