When I change num_partitions and batch_size for ClusterGCNSampler, I sometimes get an error during METIS partitioning.
num_partitions = 1500
sampler = dgl.dataloading.ClusterGCNSampler(
    graph,
    num_partitions,
    prefetch_ndata=["feat", "label", "train_mask", "val_mask", "test_mask"],
)
dataloader = dgl.dataloading.DataLoader(
    graph,
    torch.arange(num_partitions).to("cuda"),
    sampler,
    device="cuda",
    batch_size=1,
    shuffle=True,
    drop_last=False,
    num_workers=0,
    use_uva=True,
)
[07:15:21] /opt/dgl/src/graph/transform/metis_partition_hetero.cc:89: Partition a graph with 232965 nodes and 114615892 edges into 1500 parts and get 0 edge cuts
test.py 18
sampler = dgl.dataloading.ClusterGCNSampler(
cluster_gcn.py 101 __init__
partition_ids = metis_partition_assignment(
partition.py 385 metis_partition_assignment
node_part = _CAPI_DGLMetisPartition_Hetero(
function.pxi 295 dgl._ffi._cy3.core.FunctionBase.__call__
function.pxi 241 dgl._ffi._cy3.core.FuncCall
dgl._ffi.base.DGLError:
[07:15:21] /opt/dgl/src/graph/transform/metis_partition_hetero.cc:106: Error in Metis partitioning: other errors
Stack trace:
[bt] (0) /opt/anaconda3/lib/python3.11/site-packages/dgl/libdgl.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x67) [0x7f32078c9a07]
[bt] (1) /opt/anaconda3/lib/python3.11/site-packages/dgl/libdgl.so(dgl::transform::MetisPartition(std::shared_ptr<dgl::UnitGraph>, int, dgl::runtime::NDArray, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool)+0x592) [0x7f3207e40c02]
[bt] (2) /opt/anaconda3/lib/python3.11/site-packages/dgl/libdgl.so(+0x9371cd) [0x7f3207e421cd]
[bt] (3) /opt/anaconda3/lib/python3.11/site-packages/dgl/libdgl.so(DGLFuncCall+0x4c) [0x7f3207d3068c]
[bt] (4) /opt/anaconda3/lib/python3.11/site-packages/dgl/_ffi/_cy3/core.cpython-311-x86_64-linux-gnu.so(+0x1ba94) [0x7f31f63fea94]
[bt] (5) /opt/anaconda3/lib/python3.11/site-packages/dgl/_ffi/_cy3/core.cpython-311-x86_64-linux-gnu.so(+0x1bdff) [0x7f31f63fedff]
[bt] (6) python(_PyObject_MakeTpCall+0x254) [0x502d54]
[bt] (7) python(_PyEval_EvalFrameDefault+0x755) [0x50f025]
[bt] (8) python(_PyFunction_Vectorcall+0x173) [0x535103]
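Since the error only shows up for some partition counts, one way to narrow it down might be to call the partitioning step from the traceback on its own for several values. The following is only a minimal sketch: `try_partition_counts` is a hypothetical helper (not part of DGL), and it assumes the partition function can be called as `partition_fn(graph, k)`, which is how `metis_partition_assignment` appears to be used in the traceback.

```python
def try_partition_counts(partition_fn, graph, counts):
    """Call partition_fn(graph, k) for each k and record which values fail.

    Hypothetical helper for diagnosis only. Returns a dict mapping each k
    to None on success, or to a short error string if the call raised.
    """
    results = {}
    for k in counts:
        try:
            partition_fn(graph, k)
            results[k] = None
        except Exception as err:  # DGLError would also be caught here
            results[k] = f"{type(err).__name__}: {err}"
    return results
```

With DGL installed, this could be run as `try_partition_counts(dgl.metis_partition_assignment, graph, [100, 500, 1000, 1500])` to see which counts trigger the error, assuming that function is exposed under that name in the installed version.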
I am not familiar with DGL, thank you for your kind advice!