Creating DGL partition graph with halo nodes as 0

pranjaln · July 10, 2023, 7:41am

I used the DGL METIS partitioning script from this tutorial (https://docs.dgl.ai/en/0.8.x/tutorials/dist/1_node_classification.html#sphx-glr-tutorials-dist-1-node-classification-py) to partition the ogbn-papers100M graph into 4 partitions. I want to create subgraphs from the existing partitions with 0 halo hops (the partitioning script defaults to 1). This is the function I have come up with.

import dgl


def create_subgraph(rank: int) -> dgl.DGLGraph:
    """
    Creates a subgraph local to the worker consisting of only inner nodes
    :param rank: The ID of the partition for which the subgraph is returned
    :return: dgl subgraph
    """
    print(f"Loading the subgraph for - {rank}")
    partition_tuple = dgl.distributed.load_partition(
        part_config='papers-4/ogbn-papers100M.json',
        part_id=rank,
        load_feats=True
    )
    part_g, node_feats = partition_tuple[0], partition_tuple[1]
    # Keep only the nodes owned by this partition (drops the halo nodes)
    g = dgl.node_subgraph(part_g, part_g.ndata['inner_node'] == 1)
    # node_feats holds one row per inner node
    g.ndata['feat'] = node_feats['_N/feat']
    g.ndata['label'] = node_feats['_N/labels']
    g.ndata['train_mask'] = node_feats['_N/train_mask']
    g.ndata['test_mask'] = node_feats['_N/test_mask']
    g.ndata['val_mask'] = node_feats['_N/val_mask']
    g = dgl.add_self_loop(g)

    return g

Is this the right way to go, or am I missing something? Here, rank is the ID of the partition that needs to be loaded.

minjie · July 12, 2023, 11:25am

Hi,

I think the easiest way is to use dgl.remove_nodes (which internally calls node_subgraph too). Otherwise, your code looks good to me. Did you encounter any errors?

pranjaln · July 12, 2023, 11:44am

@minjie I did not encounter any errors, but the number of training nodes in the DistDGL partition differs from the number I get with this function. That shouldn't happen, since I am only removing halo nodes, right?

minjie · July 12, 2023, 12:42pm

node_subgraph will relabel node IDs. To keep them consistent, you need to (1) create train/val/test masks that have the same number of elements as the number of nodes in partition_tuple[0]; (2) attach the masks to the graph partition using g.ndata; (3) call node_subgraph, which will handle the ID mapping automatically for you.
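
The three steps above could be sketched roughly as follows. This is a minimal sketch, not DGL's own code: `pad_to_partition` is a hypothetical helper, and it assumes that the rows of each tensor returned by load_partition are in the same order as the inner nodes' local IDs in the partition graph. Only the padding (step 1) is executed here; steps 2 and 3 appear as comments because they need a loaded partition.

```python
import torch


def pad_to_partition(inner_values: torch.Tensor, inner_mask: torch.Tensor) -> torch.Tensor:
    """Expand a per-inner-node tensor to one row per node of the partition graph.

    Assumption: row i of `inner_values` belongs to the i-th node (in local-ID
    order) whose entry in `inner_mask` is nonzero. Rows for halo nodes are
    filled with zeros (False for boolean masks).
    """
    inner_mask = inner_mask.bool()
    if inner_values.shape[0] != int(inner_mask.sum()):
        raise ValueError("inner_values must have one row per inner node")
    full_shape = (inner_mask.shape[0],) + tuple(inner_values.shape[1:])
    full = torch.zeros(full_shape, dtype=inner_values.dtype)
    full[inner_mask] = inner_values
    return full


# Toy partition: 5 local nodes, the first 3 are inner nodes, the last 2 are halo nodes.
inner = torch.tensor([1, 1, 1, 0, 0])
train_mask_inner = torch.tensor([True, False, True])
train_mask_full = pad_to_partition(train_mask_inner, inner)

# With a loaded partition graph `part_g` (not constructed here), steps 2 and 3 would be:
#   part_g.ndata['train_mask'] = train_mask_full
#   g = dgl.node_subgraph(part_g, part_g.ndata['inner_node'].bool())
# after which g.ndata['train_mask'] follows the relabeled node IDs.
```

Padding is cheap for masks and labels. Padding the 128-dimensional features up to every node of the partition graph would more than double their size, so it may be preferable to pad only the masks and labels.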

pranjaln · July 17, 2023, 11:30am

Hey @minjie
I ran the following script to partition the ogbn-papers100M graph into 4 parts:


import dgl
import torch as th
from ogb.nodeproppred import DglNodePropPredDataset
data = DglNodePropPredDataset(name='ogbn-papers100M')
graph, labels = data[0]
labels = labels[:, 0]
graph.ndata['labels'] = labels

splitted_idx = data.get_idx_split()
train_nid, val_nid, test_nid = splitted_idx['train'], splitted_idx['valid'], splitted_idx['test']
train_mask = th.zeros((graph.number_of_nodes(),), dtype=th.bool)
train_mask[train_nid] = True
val_mask = th.zeros((graph.number_of_nodes(),), dtype=th.bool)
val_mask[val_nid] = True
test_mask = th.zeros((graph.number_of_nodes(),), dtype=th.bool)
test_mask[test_nid] = True
graph.ndata['train_mask'] = train_mask
graph.ndata['val_mask'] = val_mask
graph.ndata['test_mask'] = test_mask

dgl.distributed.partition_graph(graph, graph_name='ogbn-papers100M', num_parts=4,
                                out_path='papers-4',
                                balance_ntypes=graph.ndata['train_mask'],
                                balance_edges=True,
                                num_hops=2)

I then load a partition using this snippet:

partition_tuple = dgl.distributed.load_partition(part_config='papers-4/ogbn-papers100M.json', part_id=0, load_feats=True)

partition_tuple[0] is the graph object and partition_tuple[1] holds the features and labels. However, it seems that partition_tuple[1] contains features only for the inner nodes, not for all nodes, as the output of the following snippet shows.

print(partition_tuple[0].num_nodes(), partition_tuple[1]['_N/feat'].shape, th.nonzero(partition_tuple[0].ndata['inner_node']).shape)
59883684 torch.Size([27230116, 128]) torch.Size([27230116, 1])

Is there a way to store the features and labels of all nodes (inner and HALO nodes) while partitioning?

minjie · July 20, 2023, 4:36am

Yes, by design each partition stores only the inner node features/labels, because otherwise there would be a lot of data redundancy. I don't think DGL currently supports keeping features/labels of HALO nodes. You may need to write your own implementation.
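
One possible customization, sketched under stated assumptions: if the full feature tensor of the original graph is still available, features for every local node (inner and HALO) can be gathered by original node ID. The helper name `gather_node_data` is hypothetical. The sketch also assumes the partition graph exposes the original node IDs, for example under ndata['orig_id'], which partition_graph is expected to store when it reshuffles node IDs; please check this for your DGL version.

```python
import torch


def gather_node_data(full_data: torch.Tensor, orig_ids: torch.Tensor,
                     chunk_size: int = 1_000_000) -> torch.Tensor:
    """Return full_data[orig_ids], copied chunk by chunk to bound temporary memory."""
    out_shape = (orig_ids.shape[0],) + tuple(full_data.shape[1:])
    out = torch.empty(out_shape, dtype=full_data.dtype)
    for start in range(0, orig_ids.shape[0], chunk_size):
        ids = orig_ids[start:start + chunk_size].long()
        out[start:start + ids.shape[0]] = full_data[ids]
    return out


# Toy data: 6 nodes in the original graph, 2 features each.
full_feat = torch.arange(12, dtype=torch.float32).reshape(6, 2)
# Hypothetical original IDs of the 4 local nodes of one partition (inner + HALO).
orig_ids = torch.tensor([4, 0, 5, 2])
local_feat = gather_node_data(full_feat, orig_ids, chunk_size=3)

# With real data this might look like (assumed key names, not verified here):
#   part_g.ndata['feat'] = gather_node_data(graph.ndata['feat'], part_g.ndata['orig_id'])
```

This reintroduces exactly the redundancy mentioned above, since every HALO node's features are copied into each partition that references it.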
