Problems in example pytorch.graphsage.train_sampling.py

  1. Why does the code convert the DGLGraph into a heterograph via g = dgl.as_heterograph(g)? What is the purpose of this conversion?

  2. Why does h_dst = h[:block.number_of_dst_nodes()] give the features of the dst nodes in a bipartite-structured graph? Does this mean the dst nodes are indexed first in a bipartite-structured graph?

For question 1: the dataset is prepared as a DGLGraph object, while the neighbor samplers only work on DGLHeteroGraph objects, and the two are different classes. The conversion is just a hack to make the neighbor samplers work on a DGLGraph, and it shouldn’t be necessary once we unify the two data structures (an ongoing refactoring effort).
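For question 2: this property is guaranteed by the to_block() method. When a graph is converted into a bipartite-structured graph with to_block, the destination nodes always come first in the index order (for every node type, if you are dealing with heterogeneous graphs). In other words, the first block.number_of_dst_nodes() input nodes are exactly the dst nodes, so slicing h with that count returns their features.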
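
To make the ordering concrete, here is a minimal pure-Python sketch of the relabeling idea. It is not DGL's actual implementation: to_block_sketch, its arguments, and the toy edge list are all hypothetical names made up for illustration. The only assumption taken from the answer above is that the dst nodes occupy the first positions on the input side.

```python
def to_block_sketch(edges, dst_nodes):
    """Relabel a sampled edge list into a bipartite block, dst nodes first.

    Hypothetical helper for illustration only (not a DGL function).
    edges: list of (src, dst) pairs using original node IDs.
    dst_nodes: unique original IDs of the seed (destination) nodes.
    """
    # The dst nodes take the first positions on the input (source) side.
    src_order = list(dst_nodes)
    seen = set(src_order)
    # Any other source node is appended after them, in order of appearance.
    for src, _ in edges:
        if src not in seen:
            seen.add(src)
            src_order.append(src)
    # Map original IDs to block-local IDs on each side.
    src_local = {nid: i for i, nid in enumerate(src_order)}
    dst_local = {nid: i for i, nid in enumerate(dst_nodes)}
    local_edges = [(src_local[s], dst_local[d]) for s, d in edges]
    return src_order, local_edges


# Toy sampled neighborhood: nodes 3 and 5 are the seeds (dst nodes).
edges = [(7, 3), (9, 3), (3, 5), (7, 5)]
dst_nodes = [3, 5]
src_order, local_edges = to_block_sketch(edges, dst_nodes)

# One feature row per input node, stored in block order.
h = [[float(nid)] for nid in src_order]

# Same slicing pattern as h_dst = h[:block.number_of_dst_nodes()].
num_dst = len(dst_nodes)
h_dst = h[:num_dst]
```

Because the seeds are placed first, h_dst holds the rows for nodes 3 and 5 without any lookup, which is why the example script can get away with a plain slice.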

Please feel free to follow up. Thanks.


This really helps me understand and learn the DGL framework, thanks very much!