Error in Metis partitioning: other errors

When I change num_partitions and batch_size for ClusterGCNSampler, the Metis partitioning step sometimes fails with the error below.
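import dgl
import torch

# `graph` is the DGLGraph loaded earlier in test.py (not shown in this post).
num_partitions = 1500
sampler = dgl.dataloading.ClusterGCNSampler(
    graph,
    num_partitions,
    prefetch_ndata=["feat", "label", "train_mask", "val_mask", "test_mask"],
)

# Each batch is one partition ID, so the "indices" are 0..num_partitions-1.
dataloader = dgl.dataloading.DataLoader(
    graph,
    torch.arange(num_partitions).to("cuda"),
    sampler,
    device="cuda",
    batch_size=1,
    shuffle=True,
    drop_last=False,
    num_workers=0,
    use_uva=True,
)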
[07:15:21] /opt/dgl/src/graph/transform/metis_partition_hetero.cc:89: Partition a graph with 232965 nodes and 114615892 edges into 1500 parts and get 0 edge cuts


test.py 18
sampler = dgl.dataloading.ClusterGCNSampler(

cluster_gcn.py 101 __init__
partition_ids = metis_partition_assignment(

partition.py 385 metis_partition_assignment
node_part = _CAPI_DGLMetisPartition_Hetero(

function.pxi 295 dgl._ffi._cy3.core.FunctionBase.__call__

function.pxi 241 dgl._ffi._cy3.core.FuncCall

dgl._ffi.base.DGLError:
[07:15:21] /opt/dgl/src/graph/transform/metis_partition_hetero.cc:106: Error in Metis partitioning: other errors
Stack trace:
[bt] (0) /opt/anaconda3/lib/python3.11/site-packages/dgl/libdgl.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x67) [0x7f32078c9a07]
[bt] (1) /opt/anaconda3/lib/python3.11/site-packages/dgl/libdgl.so(dgl::transform::MetisPartition(std::shared_ptr<dgl::UnitGraph>, int, dgl::runtime::NDArray, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool)+0x592) [0x7f3207e40c02]
[bt] (2) /opt/anaconda3/lib/python3.11/site-packages/dgl/libdgl.so(+0x9371cd) [0x7f3207e421cd]
[bt] (3) /opt/anaconda3/lib/python3.11/site-packages/dgl/libdgl.so(DGLFuncCall+0x4c) [0x7f3207d3068c]
[bt] (4) /opt/anaconda3/lib/python3.11/site-packages/dgl/_ffi/_cy3/core.cpython-311-x86_64-linux-gnu.so(+0x1ba94) [0x7f31f63fea94]
[bt] (5) /opt/anaconda3/lib/python3.11/site-packages/dgl/_ffi/_cy3/core.cpython-311-x86_64-linux-gnu.so(+0x1bdff) [0x7f31f63fedff]
[bt] (6) python(_PyObject_MakeTpCall+0x254) [0x502d54]
[bt] (7) python(_PyEval_EvalFrameDefault+0x755) [0x50f025]
[bt] (8) python(_PyFunction_Vectorcall+0x173) [0x535103]
I am not familiar with DGL, thank you for your kind advice!
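
For context on the log line above, here is a small sketch that only redoes the arithmetic on the figures it reports (232965 nodes, 114615892 edges, 1500 parts). It does not call DGL and does not identify the cause of the error. It just shows that the average partition would hold far fewer nodes than the average node has edges, which suggests (assuming the graph has no duplicate edges) that the reported "0 edge cuts" is probably not a genuine partitioning result.

```python
# Figures copied from the Metis log line in the post.
num_nodes = 232_965
num_edges = 114_615_892
num_partitions = 1500

# Average partition size if the partitions were perfectly balanced.
nodes_per_partition = num_nodes / num_partitions

# Average number of edges per node, counting each stored edge once.
avg_degree = num_edges / num_nodes

print(f"average nodes per partition: {nodes_per_partition:.1f}")
print(f"average edges per node:      {avg_degree:.1f}")
```

With these figures the average partition has about 155 nodes while the average node has about 492 edges, so many edges would be expected to cross partition boundaries.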

Also, when I move the code to another file, the error goes away, and I wonder why.

What do you mean by that? Could you provide more details? Was it run with the same configuration and arguments?

Yes, it was essentially just renaming the file.

So you just renamed test.py to something like test2.py? Did you change the working directory? Are there any other module files in the same directory? Or could you please share the whole file?
