I am very new to DGL. I am studying the underlying C++ source code, but it's hard for me to understand. I have some questions about very technical details, such as the in-memory and on-disk structure of the DGL graph format, cross-partition node access, etc. A similar question has been asked here, but it doesn’t answer my question.
Can any of the authors help me understand:
- What is the concrete structure of a DGL graph? Does it store an adjacency matrix?
- How are nodes, edges, and features represented in memory, and what data structure is used? Tensors?
- How do distributed tensors work? Is a distributed tensor spread across all the worker machines? If so, how is the data divided among them and how is it accessed?
- In distributed mode, how are the halo nodes stored in a partition?
- How are the nodes corresponding to halo nodes accessed? For example, in partition 1 on machine 1, there is a halo node whose actual node is in partition 2 on machine 2. How is it accessed from machine 1?
- Is the DGL graph serialized to disk in a binary format?
- Is the key/value store used as a parameter server?
- What data is stored in the key/value store and in what format?
Sorry for the long list of questions. It's making my head spin.