Stochastic Training on Large Graphs with TensorFlow 2

Eng2019 · October 4, 2021, 3:13pm

Hi all,
I saw there is a tutorial on stochastic training on large graphs here, and there are also GNN examples implemented in TF2 here. Is it possible to run stochastic training / minibatching on TF2? Thanks

BarclayII · October 5, 2021, 12:11pm

It should be possible, but unfortunately we don’t have TF2 counterparts for NodeDataLoader and EdgeDataLoader because we are not very familiar with how custom minibatching works in TF2. If you could point us to a reference on how TF2 custom minibatching works, then we can figure out how to do minibatch GNN training together.

Eng2019 · October 6, 2021, 11:18am

Hello!
In TF2 we can wrap our data in a tf.data.Dataset and get a NumPy iterator from it with the as_numpy_iterator method (the examples are on the same page in the documentation).
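
A minimal sketch of that pattern, assuming TensorFlow 2 and NumPy are installed. The node ID range, batch size, and shuffle seed are made-up values for illustration. A real pipeline would hand each batch of seed node IDs to a neighbor sampler, which this sketch does not attempt.

```python
import numpy as np
import tensorflow as tf

# Hypothetical training node IDs; in practice these would come from the graph.
train_nids = np.arange(10, dtype=np.int64)

# Shuffle the IDs (reshuffled on each pass by default) and cut them
# into minibatches of 4.
dataset = (
    tf.data.Dataset.from_tensor_slices(train_nids)
    .shuffle(buffer_size=len(train_nids), seed=0)
    .batch(4)
)

# as_numpy_iterator() yields each batch as a plain numpy.ndarray.
batches = list(dataset.as_numpy_iterator())
for seeds in batches:
    print(seeds.shape, seeds.dtype)
```

The last batch is smaller than the others because 10 is not a multiple of 4; passing drop_remainder=True to batch would discard it instead.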

Eng2019 · October 6, 2021, 11:29am

Also, regarding training with large graphs, can we construct the graph object in batches? My dataset contains multiple graphs and is saved in TFRecord format. I will read it using the tf.data.Dataset class above and get a NumPy iterator. In the TFRecords, each row represents one graph, and each batch from the iterator is one row, in other words one graph. Since I guess we need to construct one DGLGraph object consisting of all graphs for training and another DGLGraph object for testing (in an inductive learning setting), I guess I need to build the DGLGraph object in batches fed in from the iterator. Is the problem explanation clear enough? Thanks a lot!

Rhett-Ying · October 8, 2021, 5:30am

dgl.batch (see the DGL 0.8 documentation) can be used to batch graphs in DGL.
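
Conceptually, batching merges several graphs into one larger graph whose components stay disjoint, with node IDs shifted so they do not collide. The NumPy-only sketch below illustrates that relabeling idea on plain edge lists. It is not DGL's implementation, and batch_edge_lists is a hypothetical helper, not a DGL function.

```python
import numpy as np

def batch_edge_lists(graphs):
    """Merge (src, dst, num_nodes) triples into one disjoint-union edge list.

    Hypothetical helper for illustration only; not part of DGL.
    """
    src_parts, dst_parts, offset = [], [], 0
    for src, dst, num_nodes in graphs:
        # Shift node IDs so each graph occupies its own ID range.
        src_parts.append(np.asarray(src, dtype=np.int64) + offset)
        dst_parts.append(np.asarray(dst, dtype=np.int64) + offset)
        offset += num_nodes
    return np.concatenate(src_parts), np.concatenate(dst_parts), offset

# Two toy graphs: a 3-node path and a single 2-node edge.
g1 = ([0, 1], [1, 2], 3)
g2 = ([0], [1], 2)
src, dst, total_nodes = batch_edge_lists([g1, g2])
print(src, dst, total_nodes)
```

With DGL itself, the same effect should come from building one DGLGraph per row and passing the list of graphs to dgl.batch.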

BarclayII · October 9, 2021, 3:11am

One question I have, though: does as_numpy_iterator support yielding non-array elements? DGLGraphs are not NumPy arrays or tensors; they are more complicated than that.

Eng2019 · October 15, 2021, 12:30pm

What do you mean by non-array elements?

BarclayII · October 18, 2021, 6:08am

Like some object that is not a numpy.ndarray, a tensor, or something like that.
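
For context, as far as I know tf.data elements have to be tensors or nested structures of tensors, so arbitrary Python objects cannot be dataset elements. One possible workaround, sketched here under the assumption of TensorFlow 2.4 or later, is to carry each graph through the dataset as a pair of edge-index arrays and rebuild the graph object in Python afterwards. The toy edge lists and the name edge_list_generator are made up for illustration.

```python
import numpy as np
import tensorflow as tf

# Toy graphs as (src, dst) edge lists of different sizes; made-up data.
toy_graphs = [
    ([0, 1], [1, 2]),
    ([0], [1]),
    ([0, 1, 2], [1, 2, 0]),
]

def edge_list_generator():
    # Yield one graph per element, as a pair of int64 arrays.
    for src, dst in toy_graphs:
        yield np.asarray(src, dtype=np.int64), np.asarray(dst, dtype=np.int64)

# shape=(None,) lets each graph have a different number of edges.
signature = (
    tf.TensorSpec(shape=(None,), dtype=tf.int64),
    tf.TensorSpec(shape=(None,), dtype=tf.int64),
)
dataset = tf.data.Dataset.from_generator(
    edge_list_generator, output_signature=signature
)

# Each element comes back as a tuple of numpy arrays, one graph at a time.
elements = list(dataset.as_numpy_iterator())
for src, dst in elements:
    print(src, dst)
```

Each (src, dst) pair could then be passed to a graph constructor such as dgl.graph inside the training loop, so the dataset itself never has to hold a DGLGraph.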
