Parallel BFS/DFS traversal implementation?

The topological sort has a nice parallel implementation that does not require specifying a start node.

Are there plans to expand the BFS traversal to enable this type of parallelism? I would suggest:

  1. Making the source argument optional, and, if it is omitted, randomly choosing a start node from each connected component in the graph.

  2. Enabling the source argument to accept a list of node IDs, rather than just a single node.

In this way, it would work much more like the topological traversal. Is this feasible?

The same sort of enhancements would make sense for DFS.
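To make the suggestion concrete, here is a rough Python sketch of what a multi-source / sourceless BFS frontier generator could look like. This uses plain adjacency dicts, not the DGL API, and `bfs_frontiers` is a hypothetical name, not an existing DGL function:

```python
def bfs_frontiers(adj, sources=None):
    """Yield BFS frontiers (lists of nodes) for an undirected graph.

    adj: dict mapping node -> iterable of neighbors.
    If sources is None, one start node is picked from each connected
    component, so the whole graph is covered without a user-supplied
    root. Nodes within a frontier are independent and could be
    processed in parallel, like the levels of a topological traversal.
    """
    if sources is None:
        # Pick the first unvisited node of each component as its source.
        seen, sources = set(), []
        for node in adj:
            if node not in seen:
                sources.append(node)
                stack = [node]
                while stack:
                    u = stack.pop()
                    if u not in seen:
                        seen.add(u)
                        stack.extend(adj[u])
    visited = set(sources)
    frontier = list(sources)
    while frontier:
        yield frontier
        nxt = []
        for u in frontier:
            for v in adj[u]:
                if v not in visited:
                    visited.add(v)
                    nxt.append(v)
        frontier = nxt
```

With no source given, a two-component graph yields one merged sequence of frontiers covering both components, which is the behavior suggested in point 1 above.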

Thanks for the suggestions. They make total sense to me. Currently, we are short on manpower for optimizing these traversal functions, so we would really appreciate it if you'd like to contribute :slight_smile:. Meanwhile, we would like to learn what roles these traversal functions play in GNNs. We implemented them merely for TreeLSTM and Junction Tree VAE, but have been unable to find other examples that need them. Do you happen to know of work that could motivate us to optimize them further?

We have explored both BFS and DFS for long-range propagation across undirected graphs, and found that forward-backward BFS propagation (WAVE) is quite powerful. We applied it to several tasks: small-molecule property prediction, shortest paths in graphs, solving image mazes, representing quantum chemistry, and computing voltages in circuits.

All these tasks require long-range information propagation that cannot be captured by local aggregation. There is good theoretical and empirical evidence that all-by-all interactions between nodes/edges in a graph can be captured by three forward-backward passes of an RNN-like network across an undirected graph.
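As a toy illustration of that propagation schedule (the names and structure here are mine, not taken from the WAVE implementation), one could turn BFS levels into a forward-then-backward visiting order:

```python
def forward_backward_schedule(levels, passes=3):
    """Build a visiting order from BFS levels.

    levels: list of frontiers (each a list of nodes) from a BFS.
    Each pass sweeps the levels forward and then backward, so
    `passes=3` gives the three forward-backward sweeps mentioned
    above. Returns a flat list of frontiers in visiting order.
    """
    order = []
    for _ in range(passes):
        order.extend(levels)                  # forward sweep
        order.extend(reversed(levels))        # backward sweep
    return order
```

Each frontier in the returned schedule could then be processed by an RNN-like update, with all nodes in a frontier updated in parallel.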

Right now, the workaround I'm playing with is transforming a bidirectional digraph (how you represent UDGs) into a DAG that approximately follows the BFS, and then applying the topological traversal. Conveniently, there is a "reverse" flag we can use for the backward pass.

The problem with this workaround is that it is not an ideal way to handle cross edges. But I'm still unclear on how cross edges are handled (How are cross edges handled in traversals?).
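For what it's worth, here is a minimal sketch of that workaround, assuming a plain adjacency-dict representation (nothing DGL-specific): orient each edge along increasing BFS depth, which also makes explicit exactly which cross edges get dropped:

```python
from collections import deque

def bfs_dag_edges(adj, source):
    """Orient edges of an undirected graph along BFS levels.

    adj: dict mapping node -> iterable of neighbors.
    Keeps only edges going from depth d to depth d+1, so the result
    is a DAG usable with a topological traversal (and its reverse
    flag for the backward pass). Edges between nodes at the same BFS
    depth (cross edges) are dropped, which is the approximation
    discussed above. Returns (kept_edges, dropped_cross_edges).
    """
    depth = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in depth:
                depth[v] = depth[u] + 1
                queue.append(v)
    edges, dropped = [], []
    for u in adj:
        for v in adj[u]:
            if depth[v] == depth[u] + 1:
                edges.append((u, v))
            elif depth[v] == depth[u]:   # cross edge within a level
                dropped.append((u, v))
    return edges, dropped
```

On a triangle, for example, the edge between the two depth-1 nodes is discarded in both directions, which is precisely the information loss the workaround suffers from.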

If we can get an efficient implementation of WAVE in DGL, that would be valuable to us, and hopefully valuable to the larger community. We can also provide some code to generate datasets relevant to benchmarking algorithms on UDG information propagation.

See these papers:

https://pubs.acs.org/doi/full/10.1021/acscentsci.7b00405

https://ieeexplore.ieee.org/document/8852455/

Deep learning long-range information in undirected graphs with wave networks

Graph algorithms are key tools in many fields of science and technology. Some of these algorithms depend on propagating information between distant nodes in a graph. Recently, there have been a number of deep learning architectures proposed to learn on undirected graphs. However, most of these architectures aggregate information in the local neighborhood of a node, and therefore they may not be capable of efficiently propagating long-range information. To solve this problem we examine a recently proposed architecture, wave, which propagates information back and forth across an undirected graph in waves of nonlinear computation. We compare wave to graph convolution, an architecture based on local aggregation, and find that wave learns three different graph-based tasks with greater efficiency and accuracy. These three tasks include (1) labeling a path connecting two nodes in a graph, (2) solving a maze presented as an image, and (3) computing voltages in a circuit. These tasks range from trivial to very difficult, but wave can extrapolate from small training examples to much larger testing examples. These results show that wave may be able to efficiently solve a wide range of tasks that require long-range information propagation across undirected graphs. An implementation of the wave network, and example code for the maze task are included in the tflon deep learning toolkit (https://bitbucket.org/mkmatlock/tflon).

With all of that focused on BFS, I do think there may be places where DFS is useful. However, we opted away from it primarily because DFS has low intrinsic parallelism compared to BFS.


Right now, I want to see if we can get WAVE to work with what you currently have, even if it isn't the most efficient. I don't yet understand how the topological traversal achieves its parallelism, so I'm not sure I'll be able to figure it out.

In situations like this, I’ve found that a collaborative effort works well. At the moment @mufeili has been helping us a bit. Hopefully we can figure out some solutions together.