ModuleNotFoundError in DGL with PyTorch 2.2 on Python 3.12 Docker Image

Hello DGL Community,

I’ve been struggling with an issue for over a week now, and I’m hoping someone here can help. I’m building a Docker image using Python 3.12 as the base, and I’m using PyTorch 2.2 along with the corresponding version of DGL. However, I keep encountering the following error when trying to import DGL:

Traceback (most recent call last):
  File "/var/task/evaluate_model.py", line 10, in <module>
    import dgl
  File "/var/task/dgl/__init__.py", line 16, in <module>
    from . import (
  File "/var/task/dgl/dataloading/__init__.py", line 13, in <module>
    from .dataloader import *
  File "/var/task/dgl/dataloading/dataloader.py", line 27, in <module>
    from ..distributed import DistGraph
  File "/var/task/dgl/distributed/__init__.py", line 5, in <module>
    from .dist_graph import DistGraph, DistGraphServer, edge_split, node_split
  File "/var/task/dgl/distributed/dist_graph.py", line 11, in <module>
    from .. import backend as F, graphbolt as gb, heterograph_index
  File "/var/task/dgl/graphbolt/__init__.py", line 8, in <module>
    from .base import *
  File "/var/task/dgl/graphbolt/base.py", line 8, in <module>
    from torchdata.datapipes.iter import IterDataPipe
  File "/var/task/torchdata/datapipes/__init__.py", line 11, in <module>
    from . import iter, map, utils
  File "/var/task/torchdata/datapipes/iter/__init__.py", line 79, in <module>
    from torchdata.datapipes.iter.util.cacheholder import (
  File "/var/task/torchdata/datapipes/iter/util/cacheholder.py", line 24, in <module>
    from torch.utils._import_utils import dill_available
ModuleNotFoundError: No module named 'torch.utils._import_utils'

Here’s my requirements.txt file:

torch==2.2
dgl -f https://data.dgl.ai/wheels/torch-2.2/repo.html
pydantic
boto3==1.29.6
packaging==23.2
optuna==3.4.0
pandas==2.1.3
scikit-learn==1.3.2
SQLAlchemy==2.0.23
psycopg2-binary

The main issue is that I can’t upgrade PyTorch or DGL because DGL has a hard requirement for the GraphBolt library, which doesn’t seem to be available in versions compatible with anything beyond PyTorch 2.2.

I’ve tried numerous combinations of Python images, PyTorch versions, and DGL configurations, but I keep encountering this issue or similar dependency conflicts. It’s incredibly frustrating to spend so much time trying to get DGL to work in a Docker environment because of these dependency issues.
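The last frame of the traceback above shows torchdata importing torch.utils._import_utils, a module the installed torch does not provide. The requirements file does not pin torchdata, so one plausible reading is that pip pulled a torchdata release written for a newer torch. Below is a minimal diagnostic sketch using only the standard library. It prints the installed versions and checks whether that module can be found. The helper names installed_versions and module_available are made up for this example.

```python
import importlib.util
from importlib import metadata


def installed_versions(packages):
    """Return {package: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def module_available(dotted_name):
    """Return True if the module can be located (parent packages get imported)."""
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except (ImportError, ValueError):
        # A missing parent package raises instead of returning None.
        return False


if __name__ == "__main__":
    for pkg, ver in installed_versions(["torch", "torchdata", "dgl"]).items():
        print(f"{pkg}: {ver or 'not installed'}")
    # This is the module the traceback fails on.
    print("torch.utils._import_utils importable:",
          module_available("torch.utils._import_utils"))
```

If the last line prints False while torchdata is installed, the torch and torchdata versions probably do not belong together, and pinning torchdata to a release that matches the installed torch is worth trying.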

I would really appreciate any advice or solutions that anyone might have. Has anyone else encountered this, or does anyone know a workaround?

Thank you for your help.

Does it work with torch==2.3.1 and DGL 2.3? You should have reached out earlier; we could have helped.

Installation instructions for DGL 2.3 with torch 2.3.1 are here: Deep Graph Library

If you have a CUDA-enabled GPU, make sure to install the CUDA version of DGL.

Hello, I had the same issue, and when I updated to 2.3.0, I got the following error:
AttributeError: type object 'torch._C.Tag' has no attribute 'pt2_compliant_tag'
File , line 48
     45     return correct.item() * 1.0 / len(labels)
     47 model = SAGE(in_feats=n_features, hid_feats=100, out_feats=n_labels)
---> 48 opt = torch.optim.Adam(model.parameters())
     50 for epoch in range(10):
     51     model.train()

The code is below; any advice would be appreciated!
%pip install torch==2.3.0
%pip install dgl -f https://data.dgl.ai/wheels/torch-2.3/cu121/repo.html

import dgl

dataset = dgl.data.CiteseerGraphDataset()
graph = dataset[0]

import numpy as np
import torch

# Construct a two-layer GNN model
import dgl.nn as dglnn
import torch.nn as nn
import torch.nn.functional as F

class SAGE(nn.Module):
    def __init__(self, in_feats, hid_feats, out_feats):
        super().__init__()
        self.conv1 = dglnn.SAGEConv(
            in_feats=in_feats, out_feats=hid_feats, aggregator_type='mean')
        self.conv2 = dglnn.SAGEConv(
            in_feats=hid_feats, out_feats=out_feats, aggregator_type='mean')

    def forward(self, graph, inputs):
        # inputs are features of nodes
        h = self.conv1(graph, inputs)
        h = F.relu(h)
        h = self.conv2(graph, h)
        return h

node_features = graph.ndata['feat']
node_labels = graph.ndata['label']
train_mask = graph.ndata['train_mask']
valid_mask = graph.ndata['val_mask']
test_mask = graph.ndata['test_mask']
n_features = node_features.shape[1]
n_labels = int(node_labels.max().item() + 1)

def evaluate(model, graph, features, labels, mask):
    model.eval()
    with torch.no_grad():
        logits = model(graph, features)
        logits = logits[mask]
        labels = labels[mask]
        _, indices = torch.max(logits, dim=1)
        correct = torch.sum(indices == labels)
        return correct.item() * 1.0 / len(labels)

model = SAGE(in_feats=n_features, hid_feats=100, out_feats=n_labels)
opt = torch.optim.Adam(model.parameters())

No, I have a different problem. This is the error:

################################################################################
WARNING!
The 'datapipes', 'dataloader2' modules are deprecated and will be removed in a
future torchdata release! Please see https://github.com/pytorch/data/issues/1196
to learn more and leave feedback.
################################################################################

  deprecation_warning()
Traceback (most recent call last):
  File "/var/task/evaluate_model.py", line 10, in <module>
    import dgl
  File "/var/task/dgl/__init__.py", line 16, in <module>
    from . import (
  File "/var/task/dgl/dataloading/__init__.py", line 13, in <module>
    from .dataloader import *
  File "/var/task/dgl/dataloading/dataloader.py", line 27, in <module>
    from ..distributed import DistGraph
  File "/var/task/dgl/distributed/__init__.py", line 5, in <module>
    from .dist_graph import DistGraph, DistGraphServer, edge_split, node_split
  File "/var/task/dgl/distributed/dist_graph.py", line 11, in <module>
    from .. import backend as F, graphbolt as gb, heterograph_index
  File "/var/task/dgl/graphbolt/__init__.py", line 55, in <module>
    load_graphbolt()
  File "/var/task/dgl/graphbolt/__init__.py", line 45, in load_graphbolt
    raise FileNotFoundError(
FileNotFoundError: Cannot find DGL C++ graphbolt library at /var/task/dgl/graphbolt/libgraphbolt_pytorch_2.3.0.so

Looking at the libraries on the image, I can see that the installed GraphBolt binaries only go up to PyTorch 2.2. That is why I was using an earlier version.

How did you install DGL when you got this error?

If you have the CPU version of torch 2.3, then you need to use this link to install DGL: https://data.dgl.ai/wheels/torch-2.3/repo.html

What about the GPU version? Thanks.

Both of the commands below apply only to torch 2.3.

pip install dgl -f https://data.dgl.ai/wheels/torch-2.3/cu121/repo.html

for CUDA 12.1, and

pip install dgl -f https://data.dgl.ai/wheels/torch-2.3/cu118/repo.html

for CUDA 11.8.
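The wheel links in this thread follow a visible pattern: the torch major.minor version, then an optional CUDA tag. A minimal sketch that builds the link from the values torch itself reports (torch.__version__ and torch.version.cuda) could look like this. The function name dgl_wheel_index is hypothetical, and the pattern is only inferred from the links posted here, so not every combination it can produce is guaranteed to exist on the server.

```python
def dgl_wheel_index(torch_version, cuda_version=None):
    """Build a find-links URL following the pattern of the links in this thread.

    torch_version: e.g. "2.3.0" or "2.3.0+cu121" (what torch.__version__ returns).
    cuda_version:  e.g. "12.1" (what torch.version.cuda returns), or None for CPU.
    """
    # Keep only major.minor: "2.3.0+cu121" -> "2.3"
    release = torch_version.split("+", 1)[0]
    major_minor = ".".join(release.split(".")[:2])
    base = f"https://data.dgl.ai/wheels/torch-{major_minor}"
    if cuda_version is None:
        return f"{base}/repo.html"
    # "12.1" -> "cu121"
    cuda_tag = "cu" + cuda_version.replace(".", "")
    return f"{base}/{cuda_tag}/repo.html"


if __name__ == "__main__":
    print("pip install dgl -f", dgl_wheel_index("2.3.0", "12.1"))
    print("pip install dgl -f", dgl_wheel_index("2.3.0", "11.8"))
    print("pip install dgl -f", dgl_wheel_index("2.3.0"))
```

Deriving the link from the installed torch, rather than typing it by hand, makes it harder to pair a CPU torch with a CUDA DGL wheel by accident.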

Thank you. With that, I got exactly the same error that the previous user posted.

I can further report that, after installing, import dgl raises the following error:
ImportError: Cannot load Graphbolt C++ library
File /local_disk0/.ephemeral_nfs/envs/pythonEnv-242fa303-4a84-4d3b-aec6-3a2e19f10b8c/lib/python3.10/site-packages/dgl/graphbolt/__init__.py:31, in load_graphbolt()
     30 try:
---> 31     torch.classes.load_library(path)
     32 except Exception:  # pylint: disable=W0703

I have checked that the torch and CUDA versions match the installation commands you recommended:
%pip install torch==2.3.0
%pip install dgl -f https://data.dgl.ai/wheels/torch-2.3/cu121/repo.html

Any advice? Thank you.

Are you sure you're installing the CUDA version of torch? Could you try installing with pip install torch==2.3.0 torchvision==0.18.0 torchaudio==2.3.0 --index-url https://download.pytorch.org/whl/cu121 and then check torch.cuda.is_available()?

After running:
%pip install torch==2.3.0 torchvision==0.18.0 torchaudio==2.3.0 --index-url https://download.pytorch.org/whl/cu121
%pip install dgl -f https://data.dgl.ai/wheels/torch-2.3/cu121/repo.html
torch.cuda.is_available() still returns False somehow…
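A False result from torch.cuda.is_available() can mean two different things: the installed torch is a CPU-only build, or it is a CUDA build that cannot see a usable GPU or driver. The sketch below separates the two cases using torch.version.cuda, which is None for CPU-only builds. The helper name describe_torch_build is made up for this example.

```python
def describe_torch_build(cuda_build, cuda_available):
    """Interpret the two facts that matter when picking a DGL wheel.

    cuda_build:     torch.version.cuda ("12.1" style string, or None for CPU-only).
    cuda_available: the result of torch.cuda.is_available().
    """
    if cuda_build is None:
        return "CPU-only torch build (torch.version.cuda is None)"
    if not cuda_available:
        return (f"torch built for CUDA {cuda_build}, but no usable GPU/driver "
                "is visible to this process")
    return f"torch built for CUDA {cuda_build} and a GPU is usable"


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
    else:
        print("torch", torch.__version__)
        print(describe_torch_build(torch.version.cuda, torch.cuda.is_available()))
```

In a notebook, a torch that was imported before the %pip install stays loaded until the Python process restarts, so it is worth running this check in a fresh session.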

I just ran pip install -r requirements.txt, using the same requirements I posted originally but with 2.3 substituted for the PyTorch and DGL versions.

DGL 2.4 is out and it has support for torch 2.4.1. Would you like to try out this combination?

Sure. If it's not too much trouble, what is the command to install DGL 2.4 with torch 2.4.1 and CUDA 12.1?

CUDA: pip install dgl -f https://data.dgl.ai/wheels/torch-2.4/cu124/repo.html

CPU: pip install dgl -f https://data.dgl.ai/wheels/torch-2.4/repo.html

Thank you, could we use cu121 instead?

Yes. In the Getting Started section of the DGL website, you can pick the CUDA version, the torch version, conda or pip, and so on.

https://www.dgl.ai/pages/start.html

Thank you. I installed DGL 2.4 and torch 2.4, but

import dgl

still gives error:
FileNotFoundError: Cannot find DGL C++ graphbolt library at /local_disk0/.ephemeral_nfs/envs/pythonEnv-525aecaa-dec3-4609-a10d-dbd14eef7889/lib/python3.10/site-packages/dgl/graphbolt/libgraphbolt_pytorch_2.4.0.so

I downloaded the wheel and the file is there:

I recommend going to the installation directory and verifying if the .so files can be found there.
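That check can be scripted. The sketch below lists the GraphBolt shared libraries bundled with the installed dgl package and prints the file name DGL appears to look for. It assumes the naming seen in the error messages in this thread (libgraphbolt_pytorch_<torch version>.so, with any local suffix such as +cu121 removed). That naming is inferred from those messages, not taken from DGL documentation, and both helper names are hypothetical.

```python
import importlib.util
from pathlib import Path


def expected_graphbolt_lib(torch_version):
    """Guess the library file name from the error messages in this thread.

    Assumption: libgraphbolt_pytorch_<version>.so, local suffix ("+cu121") dropped.
    """
    return f"libgraphbolt_pytorch_{torch_version.split('+', 1)[0]}.so"


def bundled_graphbolt_libs(graphbolt_dir):
    """Return the sorted names of GraphBolt shared libraries in a directory."""
    return sorted(p.name for p in Path(graphbolt_dir).glob("libgraphbolt_pytorch_*.so"))


if __name__ == "__main__":
    # Locate the dgl package without importing it (the import is what fails).
    spec = importlib.util.find_spec("dgl")
    if spec is None or not spec.submodule_search_locations:
        print("dgl is not installed in this environment")
    else:
        graphbolt_dir = Path(list(spec.submodule_search_locations)[0]) / "graphbolt"
        print("directory:", graphbolt_dir)
        print("bundled:", bundled_graphbolt_libs(graphbolt_dir))
        try:
            import torch
            print("expected:", expected_graphbolt_lib(torch.__version__))
        except ImportError:
            print("torch is not installed in this environment")
```

If the expected name is missing from the bundled list, the installed DGL wheel was most likely built for a different torch version than the one in the environment.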