DGL Graph Data Structure

Hi

I am very new to DGL. I am studying the underlying C++ source code, but it's hard for me to understand. I have some questions about very technical details, such as the in-memory and on-disk structure of the DGL graph format, cross-partition node access, etc. A similar question has been asked here, but it doesn't answer my question.

Can any of the authors help me understand:

  • What is the concrete structure of a DGL graph? Does it store an adjacency matrix?
  • How are nodes/edges/features represented in memory, and what is the data structure? A tensor?
  • How do distributed tensors work? Does a distributed tensor span all the worker machines? If so, how is the data divided among them and how is it accessed?
  • In distributed mode, how are the halo nodes stored in a partition?
  • How are the nodes corresponding to halo nodes accessed? For example, in partition 1 on machine 1, there is a halo node whose actual node is in partition 2 on machine 2. How is it accessed from machine 1?
  • Is the DGL graph serialized in binary format to disk?
  • Is the key/value store used as a parameter server?
  • What data is stored in the key/value store and in what format?

Sorry for the long list of questions. It's making my head spin.

Yes. The adjacency structure is kept in sparse formats (COO/CSC/CSR), which are created if necessary.
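
For readers unfamiliar with these sparse formats, here is a minimal pure-Python sketch of how a COO edge list can be turned into CSR. It is only an illustration of the formats themselves and not DGL's C++ implementation; the function name `coo_to_csr` and the returned `edge_ids` array are assumptions made for this example.

```python
# Toy illustration of COO -> CSR conversion. Not DGL's actual implementation.

def coo_to_csr(num_nodes, src, dst):
    """Convert COO edge lists (edge i goes src[i] -> dst[i]) into CSR arrays.

    Returns (indptr, indices, edge_ids):
      - the neighbours of node u are indices[indptr[u]:indptr[u + 1]]
      - edge_ids maps each CSR slot back to its position in the COO lists
    """
    # Count the out-degree of every source node.
    indptr = [0] * (num_nodes + 1)
    for s in src:
        indptr[s + 1] += 1
    # Prefix sum turns the counts into row start offsets.
    for u in range(num_nodes):
        indptr[u + 1] += indptr[u]

    indices = [0] * len(src)
    edge_ids = [0] * len(src)
    next_slot = list(indptr[:-1])  # next free slot in each row
    for eid, (s, d) in enumerate(zip(src, dst)):
        pos = next_slot[s]
        indices[pos] = d
        edge_ids[pos] = eid
        next_slot[s] += 1
    return indptr, indices, edge_ids


# COO representation of a 3-node graph: edges 2->0, 0->1, 1->2, 0->2.
src = [2, 0, 1, 0]
dst = [0, 1, 2, 2]
indptr, indices, edge_ids = coo_to_csr(3, src, dst)

# Out-neighbours of node 0, read directly from the CSR arrays.
neighbours_of_0 = indices[indptr[0]:indptr[1]]
```

CSC is the same construction with the roles of `src` and `dst` swapped, which is why a library can derive whichever format an operation needs from the one it already has.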

Zero-copy is achieved via NDArray, which points to the tensor memory.

DistTensor relies on the client/server framework, the KV store, and the partition policy. It is distributed across machines. Clients access the data via shared memory or pull it from remote servers. See dist_tensor.py for more details.
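
As a rough mental model of the above, the sketch below shows a tensor whose rows are split across servers by a range-based partition policy, with reads served locally or pulled from the owning server. All class names here (`ToyRangePolicy`, `ToyServer`, `ToyDistTensor`) are hypothetical and written for this post; the real logic lives in dist_tensor.py and the KV store code, and it may differ in many details.

```python
import bisect

# Hypothetical toy classes for illustration only. Not the DGL API.

class ToyRangePolicy:
    """Maps a global row ID to (partition ID, local row ID) by contiguous ranges."""

    def __init__(self, boundaries):
        # boundaries[i] is the first global ID owned by partition i;
        # the last entry is the total number of rows.
        self.boundaries = boundaries

    def to_partid(self, gid):
        if not 0 <= gid < self.boundaries[-1]:
            raise ValueError(f"global ID {gid} out of range")
        return bisect.bisect_right(self.boundaries, gid) - 1

    def to_local(self, gid):
        return gid - self.boundaries[self.to_partid(gid)]


class ToyServer:
    """Stands in for a KV store server that owns one partition's rows."""

    def __init__(self, rows):
        self.rows = rows

    def pull(self, local_ids):
        return [self.rows[i] for i in local_ids]


class ToyDistTensor:
    """Client-side view: looks like one tensor, but rows live on several servers."""

    def __init__(self, policy, servers, local_part):
        self.policy = policy
        self.servers = servers
        self.local_part = local_part
        self.remote_requests = 0  # counts rows that were not on this machine

    def __getitem__(self, gids):
        out = []
        for gid in gids:
            part = self.policy.to_partid(gid)
            if part != self.local_part:
                # In a real system this would be a network request.
                self.remote_requests += 1
            out.append(self.servers[part].pull([self.policy.to_local(gid)])[0])
        return out


# Partition 0 owns global rows 0..2, partition 1 owns global rows 3..4.
policy = ToyRangePolicy([0, 3, 5])
servers = [ToyServer([[0.0], [0.1], [0.2]]), ToyServer([[1.0], [1.1]])]
feat = ToyDistTensor(policy, servers, local_part=0)
rows = feat[[1, 4, 3]]
```

In this toy version the client running next to partition 0 reads row 1 locally and counts two remote reads for rows 4 and 3. A real implementation would batch the IDs per server instead of issuing one request per row.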

Halo nodes are stored in the partition's DGLGraph, and flags like inner_node are used to distinguish inner nodes from halo nodes.

The client sends a request to the remote server and fetches the response.
Please refer to dgl/graph_services.py at 97b2ab53e27c6ba864312040d1fc9c78a49dac7d · dmlc/dgl · GitHub for more details.
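
To illustrate the idea for halo nodes, here is a minimal sketch, assuming each partition keeps a per-node inner_node flag and the global ID of every local node. The dictionary layout and the helper names `fetch_remote` and `gather_features` are made up for this example; they are not functions from graph_services.py.

```python
# Hypothetical sketch of inner vs. halo node handling. Not DGL code.

# Partition 1's local view: three inner nodes plus one halo node
# (global ID 3) whose data is owned by another partition.
partition = {
    "global_id": [0, 1, 2, 3],
    "inner_node": [True, True, True, False],
}

# Features owned by this machine, keyed by global ID.
local_feats = {0: [0.0], 1: [0.1], 2: [0.2]}

# Features owned by the other machine; only reachable through a request.
remote_feats = {3: [1.0], 4: [1.1]}


def fetch_remote(gids):
    """Stand-in for sending a request to the remote server and reading the response."""
    return {g: remote_feats[g] for g in gids}


def gather_features(partition, local_feats, fetch_remote):
    """Return one feature row per local node, pulling halo rows remotely."""
    pairs = list(zip(partition["global_id"], partition["inner_node"]))
    halo_ids = [g for g, inner in pairs if not inner]
    # One batched request covers all halo nodes.
    remote = fetch_remote(halo_ids) if halo_ids else {}
    return [local_feats[g] if inner else remote[g] for g, inner in pairs]


feats = gather_features(partition, local_feats, fetch_remote)
```

The point of the flag is that the local graph structure can include the halo node and its edges, while the data for that node is still looked up by global ID on the machine that owns it.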

Yes. Load/save functions are defined for serialization. For example: dgl/unit_graph.cc at 97b2ab53e27c6ba864312040d1fc9c78a49dac7d · dmlc/dgl · GitHub

It follows a client/server design.

It stores graph data and dist tensors in key-value format, where the keys are strings and the values are tensor data.

Thank you @Rhett-Ying for the detailed explanation. Much appreciated.