I want to learn how DGL is implemented. Can anyone give me a guide? I would also like to understand the motivation behind DGL and how it is optimized. Thanks!
We will probably publish a detailed code walkthrough as a blog post this month. You can subscribe to our blog for updates. In the meantime, here is a brief summary of what is in the source tree:
|-- conda                      # conda related install script
|-- docker                     # docker script
|-- docs                       # all the document codes using sphinx
|-- examples
    |-- mxnet                  # mxnet examples
    |-- pytorch                # pytorch examples
|-- include                    # C++ lib headers
    |-- dgl
        |-- runtime            # headers for CFFI solution
        |-- ...
        |-- graph.h            # Graph data structure using adjlist
        |-- graph_interface.h  # The base interface class
        |-- immutable_graph.h  # graph data structure using CSR
        |-- graph_op.h         # graph traversal, transformation, etc.
        |-- scheduler.h        # C routines used by scheduler
|-- python
    |-- dgl
        |-- _ffi               # CFFI python side codes
        |-- backend            # mxnet/pytorch specific backend codes
        |-- contrib            # codes in the stage of contribution
        |-- data               # ready-to-use dataset package such as CoraDataset
        |-- function           # builtin message/reduce functions
        |-- nn                 # pre-defined GNN layers (e.g. GCNLayer)
        |-- runtime            # IR and execution logic for message passing
        |-- base.py
        |-- batched_graph.py   # for batching multiple graphs
        |-- frame.py           # data structure for storing node/edge features
        |-- graph.py           # DGLGraph (~= GraphIndex + Frame)
        |-- graph_index.py     # graph structure class (no feature storage)
        |-- init.py            # feature initializers
        |-- ndarray.py         # internal ndarray wrapper used by DGL
        |-- propagate.py       # propagation APIs (e.g. topo_nodes)
        |-- subgraph.py        # subgraph data structure
        |-- transform.py       # graph transformation APIs (e.g. line_graph)
        |-- traversal.py       # graph traversal APIs
        |-- udf.py             # UDF related data structure (e.g. NodeBatch, EdgeBatch)
        |-- view.py            # Graph views
|-- src                        # C++ source codes
    |-- graph
    |-- runtime
    |-- scheduler
|-- tests                      # unit tests
|-- third_party                # all external dependencies
|-- tutorials                  # python script for the tutorials on our doc site
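The tree mentions two graph storage formats: an adjacency list (graph.h) and CSR (immutable_graph.h). For readers unfamiliar with the difference, here is a minimal pure-Python sketch of both layouts. It is only an illustration, not DGL's actual code: the function names `build_adjlist`, `build_csr`, and `csr_successors` are hypothetical, and the real implementations are in C++.

```python
def build_adjlist(num_nodes, edges):
    """Adjacency list: one growable list of successors per node.

    Easy to mutate (appending an edge is cheap), which suits a mutable graph.
    """
    adj = [[] for _ in range(num_nodes)]
    for src, dst in edges:
        adj[src].append(dst)
    return adj


def build_csr(num_nodes, edges):
    """CSR: two flat arrays, compact and cache friendly but costly to mutate.

    The successors of node v are indices[indptr[v]:indptr[v + 1]].
    """
    # Count the out-degree of each node, shifted by one slot.
    indptr = [0] * (num_nodes + 1)
    for src, _ in edges:
        indptr[src + 1] += 1
    # Prefix sum turns the counts into row start offsets.
    for v in range(num_nodes):
        indptr[v + 1] += indptr[v]
    # Place each destination into the next free slot of its source row.
    indices = [0] * len(edges)
    cursor = indptr[:-1].copy()
    for src, dst in edges:
        indices[cursor[src]] = dst
        cursor[src] += 1
    return indptr, indices


def csr_successors(indptr, indices, v):
    """Return the successors of node v from the CSR arrays."""
    return indices[indptr[v]:indptr[v + 1]]


# A toy graph with 4 nodes and 4 directed edges (made-up data).
edges = [(0, 1), (0, 2), (1, 2), (3, 0)]
adj = build_adjlist(4, edges)
indptr, indices = build_csr(4, edges)
```

Both structures answer the same query (the successors of a node). The adjacency list makes adding edges cheap, while CSR trades that for flat arrays that are easier to hand to vectorized kernels.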
@minjie Could you elaborate on how the runtime and scheduling work? I have been looking through the code and it's very clear and well commented, but I think the overall methodology is lost on me. Could you explain what the IR stands for and what its purpose is? And could you perhaps give a quick list of the high-level steps involved in taking a send or recv call and ultimately transforming it into kernels that run on the GPU?