Doubt regarding computation graph

pranjaln · January 13, 2024, 7:20pm

import dgl
import torch
u, v = torch.tensor([0, 0, 0, 1]), torch.tensor([1, 2, 3, 3])
graph = dgl.graph((u, v))
sampler = dgl.dataloading.NeighborSampler([-1, -1])
dataloader = dgl.dataloading.DataLoader(
    graph, [3], sampler,
    batch_size=1,
    shuffle=False,
    drop_last=False)
for input_nodes, output_nodes, blocks in dataloader:
    print(blocks)
    print(blocks[0].edges(), blocks[1].edges())

I have the aforementioned script which creates a DGL graph (as shown here) as shown in the Fig
.

I assume the training node as node ID 3 and try to build a computation graph for a 2-layered GNN. The computation graph should have 1 node at Level 0 (node 3), 2 nodes at Level 1 (nodes 0 and 1) and
1 node at Level 2 (node 0). I print the MFGs generated using the aforementioned script. The output is -

[Block(num_src_nodes=3, num_dst_nodes=3, num_edges=3), Block(num_src_nodes=3, num_dst_nodes=1, num_edges=2)]
(tensor([1, 2, 2]), tensor([0, 0, 1])) (tensor([1, 2]), tensor([0, 0]))

I have the following questions -

Is my understanding of the computation graph correct?
If so, the MFGs generated do not match the computation graph at all.
The second MFG has 1 dst node and 3 src nodes, but only 2 edges. How is that possible?

BearBiscuit05 · January 14, 2024, 1:51am

Hi, ~~your understanding of the computation graph is correct~~, but DGL makes some transformations on it. You can consider your graph as a DGLGraph, but after sampling, it transforms into a dgl.block. It will also reindex the nodes after sampling, which is why it may look a bit different. Regarding the third question, in the block, node 0 (the node with the reindex ID before as node 3) serves as both the src and dst nodes. This setup is for message passing. Therefore, src nodes are [0,1,2] (before reindex [3,1,0]), and ‘dst’ nodes are [0] (before reindex [3]). You can also use dgl.NID to check.

pranjaln · January 14, 2024, 4:59am

Thanks for clarifying. I understand that DGL changes the computation graph into bipartite graph blocks. I still don’t understand the following -

What is the requirement of including the source node in the destination nodes as well? As far as I understand, if I wanted to include the message of the source node in the embedding calculation, I could add a self-edge. Then why include the source node in the dest nodes?
Assuming what you said, are there going to be dangling nodes (nodes without any edges) in the blocks?
I did use dgl.NID to see the node IDs of the 2 MFGs for the above example. The src and dst for the first MFG is

src = tensor([3, 0, 1])
dst = tensor([3, 0, 1])

and for the second MFG is

src = tensor([3])
dst = tensor([3, 0, 1])

What is unclear to me is that the first MFG should have only 1 edge and consequently, only 1 node (node 0). However, there are 3 edges in this case. I don’t understand how this could be.

Again, thank you for answering.

BearBiscuit05 · January 14, 2024, 6:42am

Yes, I think there were some inaccuracies in my previous response. The vertices already present in the computation graph still undergo neighbor sampling at each hop. However, since you have set it for full neighborhood sampling, no new nodes/edges are added. I believe you can learn about how blocks are constructed through this link. Each block takes all the current src nodes as the dst nodes for the next block, and the next round of sampling is based on this. I apologize for any misunderstanding caused by my previous answer.

import dgl
import torch
u, v = torch.tensor([0, 0, 0, 1]), torch.tensor([1, 2, 3, 3])
graph = dgl.graph((u, v))
sampler = dgl.dataloading.NeighborSampler([1, 1])
dataloader = dgl.dataloading.DataLoader(
    graph, [3], sampler,
    batch_size=1,
    shuffle=False,
    drop_last=False)
for input_nodes, output_nodes, blocks in dataloader:
    print(blocks)
    print(blocks[0].srcdata[dgl.NID], blocks[0].dstdata[dgl.NID])
    print(blocks[1].srcdata[dgl.NID], blocks[1].dstdata[dgl.NID])

You can try this code multiple times, and it will occur that if node 3 is sampled to node 0 in the first sampling, in the second sampling, node 1 can still be sampled by node 3.

frozenbugs · January 18, 2024, 2:14am

BTW, we have a doc to explain how MFG works, Introduction of Neighbor Sampling for GNN Training — DGL 1.1.3 documentation, PTAL as well, hope it helps.

pranjaln · January 18, 2024, 6:57am

Thank a lot @BearBiscuit05 and @frozenbugs
It is finally clear to me how the MFGs are created.

pranjaln · February 3, 2024, 7:35pm

Hey @BearBiscuit05 @frozenbugs
Just a follow-up.
From my understanding, blocks[0].edges() gives the edges from the outermost hop to the penultimate hop. However, the edges are represented using relabelled nodes from 0 - number of nodes in that MFG - 1. Is there a way to get the original IDs of these relabelled nodes back? Is blocks[0].srcdata[dgl.NID] a reliable way to get this mapping?

Rhett-Ying · February 4, 2024, 2:11am

Yes. (minimum 20 characters)

system · March 5, 2024, 2:11am

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.