About something like `pyg.data.separate.separate`

Currently, there are basically two ways to store many small graphs in a DGL dataset. The first stores only the raw graph structure and constructs a DGLGraph on demand in the dataset's `__getitem__` method (e.g. QM9Dataset). The second stores the graphs as a list in a member variable of the dataset (e.g. ZINCDataset). The first can be cumbersome, while the second may run into memory issues. With something like `separate`, it would be possible to store a list of graphs as a single batched graph and retrieve any one of them from it without unbatching the whole thing.
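To make the idea concrete, here is a minimal pure-Python sketch of the "store once, slice on demand" layout. The function names `collate` and `separate`, the `node_ptr`/`edge_ptr` keys, and the toy graphs are hypothetical and only illustrate the mechanism; this is not PyG's or DGL's actual implementation.

```python
from itertools import accumulate


def collate(graphs):
    """Concatenate per-graph edge lists into flat lists plus offset pointers.

    Each graph is given as (num_nodes, [(src, dst), ...]) with local node IDs.
    """
    node_ptr = [0, *accumulate(n for n, _ in graphs)]
    edge_ptr = [0, *accumulate(len(edges) for _, edges in graphs)]
    src, dst = [], []
    for (_, edges), offset in zip(graphs, node_ptr):
        for u, v in edges:
            # Shift local node IDs into the global (batched) ID space.
            src.append(u + offset)
            dst.append(v + offset)
    return {"src": src, "dst": dst, "node_ptr": node_ptr, "edge_ptr": edge_ptr}


def separate(store, idx):
    """Recover graph `idx` from the flat storage without touching the others."""
    n0, n1 = store["node_ptr"][idx], store["node_ptr"][idx + 1]
    e0, e1 = store["edge_ptr"][idx], store["edge_ptr"][idx + 1]
    # Shift global node IDs back to local ones.
    edges = [(u - n0, v - n0)
             for u, v in zip(store["src"][e0:e1], store["dst"][e0:e1])]
    return n1 - n0, edges


# Toy data: three small graphs with 3, 2 and 4 nodes.
graphs = [(3, [(0, 1), (1, 2)]), (2, [(0, 1)]), (4, [(0, 3), (2, 1), (3, 0)])]
store = collate(graphs)
second = separate(store, 1)
```

Only the two pointer lists are needed to locate any graph, so a dataset's `__getitem__` could call something like `separate(store, idx)` instead of keeping one object per graph in memory.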

Does dgl.slice_batch work for you?

Thanks, it may work. By the way, may I ask what prevents DGL from supporting the saving and loading of batched graphs? It would simply be much faster.

A batched graph is built from a list of DGLGraphs, so could you save that list with dgl.save_graphs() together with any additional info?

I mean saving batched_graphs = dgl.batch(graphs) directly and preserving the information needed for unbatching after loading. See the GitHub issue "Any example for save/load single graph and batched graph?" (Issue #936, dmlc/dgl).

There is no native support for saving batched graphs for now. If you really want it, please file a feature request in the DGL repo.
