Random walk impl with COO format (not CSR)?

erickim555 · April 27, 2022, 11:04pm

The current DGL random walk implementations require a CSR graph format. For my use case, the CSR format requires a much larger memory footprint than its COO format:
For 68B edges (5 edge types), DGL(COO) is 1099GB cpu memory, and DGL(CSR) is 2470GB cpu memory.

Do you think it’d be possible to implement a random walk implementation that uses the COO format and is also just as performant as the CSR-based implementation? The high CPU memory overhead of CSR is causing complications with serving.

Thanks!

BarclayII · April 30, 2022, 6:59am

Is the number 2470G only from CSR or from both CSR and COO? In general COO storage is O(2 * E) while CSR is O(2 * E + N). It shouldn’t differ that much. You could try directly creating a graph from CSR and see again.

In general I think it’s hard to implement a COO-based random walk because you need to find the successors of a vertex, unless the COO array is sorted (which is equivalent to a CSR).

erickim555 · May 4, 2022, 6:09pm

The 2470G is only from CSR (I had used graph = graph.formats(['csr']) to force CSR only). This CPU memory discrepancy is indeed strange, I need to investigate what’s going on there, as for our graphs num_edges >> num_nodes.

Thanks for the info!

BarclayII · May 9, 2022, 6:08am

One thing I realized recently is that our COO to CSR conversion is not optimal; while the output consumption is O(2 * E + N), the peak consumption is actually more than that, probably 4 * E, because we maintain some O(E) arrays for sorting.

You could directly construct a CSR graph by

g = dgl.graph(('csr', (indptr, indices, data)))

system · June 8, 2022, 6:09am

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.