FileNotFoundError: Cannot find DGL C++ graphbolt library with DGL 2.2.1 and PyTorch 2.3.0

I want to run DGL on my Linux server, and this is my conda environment:

cuda-cudart               12.1.105                      0    nvidia
cuda-cupti                12.1.105                      0    nvidia
cuda-libraries            12.1.0                        0    nvidia
cuda-nvrtc                12.1.105                      0    nvidia
cuda-nvtx                 12.1.105                      0    nvidia
cuda-opencl               12.4.127                      0    nvidia
cuda-runtime              12.1.0                        0    nvidia
cuda-version              11.8                 h70ddcb2_3    conda-forge
cudatoolkit               11.8.0              h4ba93d1_13    conda-forge
cudnn                     8.9.7.29             hbc23b4c_3    conda-forge
dgl                       2.2.1.th23.cu118         py311_0    dglteam/label/th23_cu118
pytorch                   2.3.0           cuda118_py311h6c9cb27_300    conda-forge
pytorch-cuda              12.1                 ha16c6d3_5    pytorch

(The whole conda environment is here.)

But when I import DGL, it crashes with this:

ERROR: found no collectors for /home/jiaxuwu/IdeaProjects/torchhydro/tests/test_dgl_forward.py::test_train_graph

tests/test_dgl_forward.py:None (tests/test_dgl_forward.py)
test_dgl_forward.py:1: in <module>
    import dgl
../../../.conda/envs/forest-minio/lib/python3.11/site-packages/dgl/__init__.py:16: in <module>
    from . import (
../../../.conda/envs/forest-minio/lib/python3.11/site-packages/dgl/dataloading/__init__.py:13: in <module>
    from .dataloader import *
../../../.conda/envs/forest-minio/lib/python3.11/site-packages/dgl/dataloading/dataloader.py:27: in <module>
    from ..distributed import DistGraph
../../../.conda/envs/forest-minio/lib/python3.11/site-packages/dgl/distributed/__init__.py:5: in <module>
    from .dist_graph import DistGraph, DistGraphServer, edge_split, node_split
../../../.conda/envs/forest-minio/lib/python3.11/site-packages/dgl/distributed/dist_graph.py:11: in <module>
    from .. import backend as F, graphbolt as gb, heterograph_index
../../../.conda/envs/forest-minio/lib/python3.11/site-packages/dgl/graphbolt/__init__.py:36: in <module>
    load_graphbolt()
../../../.conda/envs/forest-minio/lib/python3.11/site-packages/dgl/graphbolt/__init__.py:26: in load_graphbolt
    raise FileNotFoundError(
E   FileNotFoundError: Cannot find DGL C++ graphbolt library at /home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/dgl/graphbolt/libgraphbolt_pytorch_2.3.0.post300.so

Is there a problem with my environment? How can I resolve this?
I look forward to your reply.

The target .so name is incorrect; post300 is not expected. How did you install PyTorch and DGL? It seems to be via conda. Could you try installing both with pip instead?
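
For context, the file name in the error appears to be built from the version string that the installed PyTorch reports. The sketch below is only an illustration, not DGL's actual loader: the helper names expected_graphbolt_name and available_graphbolt_libs are hypothetical, and the assumption that only a "+local" suffix is stripped from the version is inferred from the error message. It shows why a build whose version string is 2.3.0.post300 would ask for a file that a package built against plain 2.3.0 does not ship.

```python
from pathlib import Path


def expected_graphbolt_name(torch_version: str) -> str:
    """Build the library file name from a PyTorch version string.

    Assumption (not confirmed by the thread): only a local suffix such as
    "+cu121" is dropped, so a ".post300" build tag stays in the name.
    """
    base_version = torch_version.split("+", maxsplit=1)[0]
    return f"libgraphbolt_pytorch_{base_version}.so"


def available_graphbolt_libs(graphbolt_dir: str) -> list[str]:
    """List the graphbolt libraries actually present in a directory."""
    pattern = "libgraphbolt_pytorch_*.so"
    return sorted(p.name for p in Path(graphbolt_dir).glob(pattern))


if __name__ == "__main__":
    # A pip-style version string versus the one seen in the traceback.
    print(expected_graphbolt_name("2.3.0+cu121"))
    print(expected_graphbolt_name("2.3.0.post300"))
```

To compare against a real environment, one could pass torch.__version__ to the first helper and the dgl/graphbolt directory under site-packages to the second.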

My server only allows us to manage packages with conda, so I use conda.
I also tried updating PyTorch and DGL to CUDA 12.1, as shown below:

 conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia

 added / updated specs:
    - pytorch
    - pytorch-cuda=12.1
    - torchaudio
    - torchvision


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    libtorch-2.3.0             |cuda118_h8db9d67_301       455.3 MB  conda-forge
    pytorch-2.3.0              |cuda118_py311h4ee7bbc_301        32.2 MB  conda-forge
    ------------------------------------------------------------
                                           Total:       487.4 MB

The following packages will be UPDATED:

  libtorch                       2.3.0-cuda118_hb1906df_300 --> 2.3.0-cuda118_h8db9d67_301 
  pytorch                   2.3.0-cuda118_py311h6c9cb27_300 --> 2.3.0-cuda118_py311h4ee7bbc_301 
________________________________________
conda uninstall dgl
conda install -c dglteam/label/th23_cu121 dgl

The problem still appears.
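
Note that the transaction above still lists cuda118 builds from conda-forge, so it may be worth checking which PyTorch build Python actually imports. A minimal sketch, assuming PyTorch may or may not be importable in the active environment; the function name describe_torch_build is hypothetical:

```python
import importlib


def describe_torch_build():
    """Report the imported PyTorch version and its CUDA version.

    Returns None when PyTorch cannot be imported in this environment.
    """
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return None
    # torch.version.cuda is None for CPU-only builds.
    return {"version": torch.__version__, "cuda": torch.version.cuda}


if __name__ == "__main__":
    print(describe_torch_build())
```

If the reported version still ends in a post tag, or the CUDA version does not match the DGL channel label (for example th23_cu121), the DGL package and the PyTorch build probably do not match.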

@Rhett-Ying

Could you try with a previous torch version, such as 2.2.0?

I'm back.
I downgraded PyTorch to 2.2.2 and ran conda install -c dglteam/label/th22_cu121 dgl; this is my new conda environment.
However, when I run my test function, it crashes like this:

ModuleNotFoundError: No module named 'torchdata'

(Full traceback on Pastebin.com)
I installed torchdata:

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    portalocker-2.8.2          |  py311h38be061_1          42 KB  conda-forge
    torchdata-0.4.1            |     pyh8b8bddf_0          54 KB  conda-forge
    ------------------------------------------------------------
                                           Total:          96 KB

The following NEW packages will be INSTALLED:

  portalocker        conda-forge/linux-64::portalocker-2.8.2-py311h38be061_1 
  torchdata          conda-forge/noarch::torchdata-0.4.1-pyh8b8bddf_0 

But it still crashed like this:

ImportError: cannot import name '_check_lambda_fn' from 'torch.utils.data.datapipes.utils.common' 

(Full traceback on Pastebin.com)
Have you seen a problem like this before? What would you suggest to solve it?

The torchdata you installed seems to be too outdated. I am using 0.7.1. Could you install a newer one?

I updated torchdata to 0.7.1 and the problem disappeared.
Thank you very much. I still think you (the DGL team) should find out why libgraphbolt_pytorch_2.3.0.post300.so is looked up with PyTorch 2.3.
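
For anyone hitting the same torchdata error, a quick check of the installed version can be sketched as follows. This is a rough illustration: the helpers parse_version and is_at_least are hypothetical, and 0.7.1 is simply the version reported as working in this thread, not a documented minimum.

```python
from importlib import metadata


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '0.7.1' into (0, 7, 1); non-numeric parts like 'post300' are skipped."""
    base = text.split("+", maxsplit=1)[0]
    return tuple(int(part) for part in base.split(".") if part.isdigit())


def is_at_least(package: str, minimum: str):
    """Return True or False for an installed package, None if it is missing."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    # 0.7.1 is the torchdata version reported as working in this thread.
    print("torchdata >= 0.7.1:", is_at_least("torchdata", "0.7.1"))
```

The numeric tuple comparison is a simplification that avoids an extra dependency; it ignores pre-release and post-release tags.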


The issue is tracked here.
