Partition Graph with 0 HALO vertices

I need to create partitioned DGL graphs for a personal project, but I do not need the HALO vertices included in the partitions. I am trying to partition the ogbn-products graph with num_hops=0. This is the script I am using:

import dgl
import torch as th
from ogb.nodeproppred import DglNodePropPredDataset
data = DglNodePropPredDataset(name='ogbn-products')
graph, labels = data[0]
labels = labels[:, 0]
graph.ndata['labels'] = labels

splitted_idx = data.get_idx_split()
train_nid, val_nid, test_nid = splitted_idx['train'], splitted_idx['valid'], splitted_idx['test']
train_mask = th.zeros((graph.number_of_nodes(),), dtype=th.bool)
train_mask[train_nid] = True
val_mask = th.zeros((graph.number_of_nodes(),), dtype=th.bool)
val_mask[val_nid] = True
test_mask = th.zeros((graph.number_of_nodes(),), dtype=th.bool)
test_mask[test_nid] = True
graph.ndata['train_mask'] = train_mask
graph.ndata['val_mask'] = val_mask
graph.ndata['test_mask'] = test_mask

dgl.distributed.partition_graph(graph, graph_name='ogbn-products', num_parts=32, num_hops=0,
                                out_path='32part_data_products',
                                balance_ntypes=graph.ndata['train_mask'],
                                balance_edges=True)

However, I get the following error:

Traceback (most recent call last):
  File "partition-graph.py", line 22, in <module>
    dgl.distributed.partition_graph(graph, graph_name='ogbn-products', num_parts=32, num_hops=0,
  File "/data/pranjaln/anaconda3/envs/venv/lib/python3.8/site-packages/dgl/distributed/partition.py", line 997, in partition_graph
    assert val[-1] == g.num_edges(etype)
AssertionError

Is there a way to get around this error with num_hops=0? Or, with num_hops=1, can we load back only the independent partition without the HALO nodes? Any help would be highly appreciated.

If no HALO nodes are allowed in a partition, where would edges whose source node does not belong to that partition be saved? For example, suppose g has an edge 0 -> 1, node 1 is assigned to part_0, and node 0 is assigned to part_1. Where would the edge 0 -> 1 be saved if HALO nodes are not allowed? With num_hops=1, 0 -> 1 is saved into part_0, where 1 is an inner node and 0 is a HALO node.

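I think this is why you hit the assertion failure: edges like the one above have no partition to be saved in, so the per-partition edge counts no longer add up to the total number of edges in the graph.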
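To make the edge accounting concrete, here is a minimal plain-Python sketch. It is a toy model, not DGL's actual partitioning code: the function name assign_edges, the 4-node graph, and the node-to-partition assignment are all made up for illustration. It assumes, as described above, that each edge is stored in the partition that owns its destination node.

```python
from collections import defaultdict

def assign_edges(edges, node_part, allow_halo):
    """Toy model: store each edge in the partition that owns its destination.

    edges: list of (src, dst) pairs; node_part: dict mapping node -> partition id.
    Returns {part_id: {"edges": [...], "halo": set of non-owned source nodes}}.
    """
    parts = defaultdict(lambda: {"edges": [], "halo": set()})
    for src, dst in edges:
        owner = node_part[dst]
        if node_part[src] == owner:
            # Both endpoints live in the same partition: an ordinary inner edge.
            parts[owner]["edges"].append((src, dst))
        elif allow_halo:
            # Cut edge: keep it and record the foreign source as a HALO node.
            parts[owner]["edges"].append((src, dst))
            parts[owner]["halo"].add(src)
        # Otherwise the cut edge has nowhere to live and is dropped.
    return dict(parts)

# Hypothetical graph: node 1 is in part 0 and node 0 is in part 1,
# so the edge 0 -> 1 is a cut edge, as in the example above.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
node_part = {0: 1, 1: 0, 2: 0, 3: 1}

with_halo = assign_edges(edges, node_part, allow_halo=True)
no_halo = assign_edges(edges, node_part, allow_halo=False)

stored_with_halo = sum(len(p["edges"]) for p in with_halo.values())
stored_no_halo = sum(len(p["edges"]) for p in no_halo.values())
print(stored_with_halo, stored_no_halo, len(edges))
```

In this toy model the partitions with HALO nodes together hold every edge, while the partitions without HALO nodes hold fewer edges than the graph has, which is the kind of mismatch the failing assertion checks for.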

@Rhett-Ying Thanks. I understand why the error is being raised now. The Products graph, however, is an undirected graph, and I would like to just ignore the cut edges. Is there a way in the dgl partition_graph API to ignore the cut edges? That would, in turn, satisfy the num_hops=0 requirement that I have.

Hi @pranjaln

Currently dgl.distributed.partition_graph does not support num_hops=0. As an easy workaround, you can use the functions in dgl.subgraph (see the dgl.subgraph page of the DGL 1.1 documentation) to induce a subgraph on the nodes of each partition, which leaves out the cut edges.

You could also use the inner_node and inner_edge fields of the partitioned graphs (in g.ndata and g.edata) to remove the HALO nodes and edges.
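As a rough illustration of what removing the HALO nodes amounts to, here is a plain-Python sketch that works on ordinary lists rather than a DGL graph. The helper strip_halo and the 3-node partition are hypothetical; the inner_node list stands in for the 0/1 flags stored in g.ndata['inner_node']. It keeps the inner nodes, relabels them from 0, and keeps only the edges whose two endpoints are both inner.

```python
def strip_halo(src, dst, inner_node):
    """Keep only inner nodes and the edges whose endpoints are both inner.

    src, dst: parallel lists of local node IDs; inner_node: list of 0/1 flags.
    Returns (kept_local_ids, new_src, new_dst) with nodes relabelled from 0.
    """
    kept = [n for n, flag in enumerate(inner_node) if flag]
    relabel = {old: new for new, old in enumerate(kept)}
    new_src, new_dst = [], []
    for s, d in zip(src, dst):
        # An edge survives only if neither endpoint is a HALO node.
        if s in relabel and d in relabel:
            new_src.append(relabel[s])
            new_dst.append(relabel[d])
    return kept, new_src, new_dst

# Hypothetical partition: local nodes 0 and 1 are inner, local node 2 is HALO.
src = [2, 0, 1]
dst = [0, 1, 0]
inner_node = [1, 1, 0]

kept, new_src, new_dst = strip_halo(src, dst, inner_node)
print(kept, new_src, new_dst)
```

On a partition loaded with DGL, the same effect could presumably be obtained by passing the inner_node flags, cast to a boolean mask, to dgl.node_subgraph; that call is an assumption here and has not been tested against this dataset. Note that an edge whose source is a HALO node is dropped even if its inner_edge flag is set, because one of its endpoints no longer exists.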
