Hi,
I’m trying to run ogbn-papers100M on two machines but got stuck at the partition step. How much CPU memory is needed to run partition_graph.py on ogbn-papers100M? My machine currently has 312 GB. After running

python3 partition_graph.py --dataset ogb-paper100M --num_parts 2 --balance_train --balance_edges

it printed several "Failed on writing metis..." messages and was then Killed by the OOM killer. The complete error message is below:
load ogbn-papers100M
This will download 56.17GB. Will you proceed? (y/N)
y
Downloading http://snap.stanford.edu/ogb/data/nodeproppred/papers100M-bin.zip
Downloaded 56.17 GB: 100%|█████████████████████████████████████| 57519/57519 [18:25<00:00, 52.04it/s]
Extracting dataset/papers100M-bin.zip
Loading necessary files...
This might take a while.
Processing graphs...
100%|███████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 21399.51it/s]
Converting graphs into DGL objects...
100%|██████████████████████████████████████████████████████████████████| 1/1 [00:03<00:00, 3.80s/it]
Saving...
finish loading ogbn-papers100M
finish constructing ogbn-papers100M
load ogb-paper100M takes 2013.006 seconds
|V|=111059956, |E|=1615685872
train: 1207179, valid: 125265, test: 214338
Converting to homogeneous graph takes 22.111s, peak mem: 190.091 GB
Convert a graph into a bidirected graph: 548.671 seconds, peak memory: 233.830 GB
Construct multi-constraint weights: 4.294 seconds, peak memory: 233.830 GB
Failed on writing metis2905.6
Failed on writing metis2905.7
Failed on writing metis2905.9
Failed on writing metis2905.11
Failed on writing metis2905.12
Failed on writing metis2905.13
Killed
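
In case it helps the discussion, here is a rough sketch of the fallback I’m considering: calling dgl.distributed.partition_graph with part_method="random" to skip METIS entirely. This is untested; the graph name and output path are placeholders, and I’m assuming random partitioning has a lower peak-memory footprint since it avoids the METIS step where my run dies.

import dgl
from ogb.nodeproppred import DglNodePropPredDataset

# Load ogbn-papers100M with the same OGB loader the example script uses.
data = DglNodePropPredDataset(name="ogbn-papers100M")
g, _ = data[0]

# Random partitioning skips METIS, trading partition quality
# (more cut edges, slower distributed training) for lower peak memory.
# The balance flags feed METIS's multi-constraint weights, so I leave
# them out here. "data" is a placeholder output path.
dgl.distributed.partition_graph(
    g,
    graph_name="ogbn-papers100M",
    num_parts=2,
    out_path="data",
    part_method="random",
)

Would random partitioning be expected to fit in 312 GB, or does this dataset simply need more memory for partitioning either way?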
Thanks in advance!