Mask mechanism for GATConv with BatchedDGLGraph type data

I have run into a question:
Suppose I have a batch of data of type torch.Tensor representing the batched nodes of a batch of graphs (shape: batch_size * max_node_num * emb). The data also comes with a mask tensor, because different graphs in a batch have different numbers of nodes. I am currently building my model with a graph attention network (GAT). How can I integrate the mask tensor (batch_size * max_node_num) together with the data tensor (batch_size * max_node_num * emb) into GATConv so that the model can compute over a whole batch of data?
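For concreteness, here is a small sketch of the data layout I mean (all sizes are made up just for illustration):

```python
import torch

batch_size, max_node_num, emb = 4, 10, 32

# Padded node features: one row of max_node_num node embeddings per graph.
data = torch.randn(batch_size, max_node_num, emb)

# Number of real nodes in each graph, and the corresponding boolean mask
# (True where a node actually exists, False for padding).
num_nodes = torch.tensor([10, 7, 5, 9])
mask = torch.arange(max_node_num)[None, :] < num_nodes[:, None]  # (batch_size, max_node_num)
```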

Hoping for an answer.
Thank you!

Any particular reason you want to pad the number of nodes in each graph to max_node_num? For normal tensor computation you need to perform zero padding, but this is not required for DGL.

Hi,
The question comes from my research (NLP). I need to improve a semantic parsing model with a Graph Attention Network over batches of sentences (with different sequence lengths), each associated with a different graph. Although I could use a ‘for’ loop to traverse the graphs and concatenate the results, that would be time-consuming when dealing with a large number of batches. I think this functionality could be handled efficiently inside the package (with some acceleration, like the CUDA C++ components in Facebook AI’s ‘fairseq’). That is why I am proposing this requirement.

How about multiplying g.ndata['ft'] by the mask before L113? Maybe you can elaborate more on the computation in your model.
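Roughly something like this (a minimal sketch for a single toy graph; the field names 'ft' and 'mask' just follow this thread, and the graph-construction calls are from a recent DGL release, so they may differ from your version):

```python
import dgl
import torch
from dgl.nn.pytorch import GATConv

emb = 16

# A 4-node ring graph so every node has an in-degree of at least 1.
g = dgl.graph(([0, 1, 2, 3], [1, 2, 3, 0]))
g.ndata['ft'] = torch.randn(g.num_nodes(), emb)
g.ndata['mask'] = torch.tensor([[1.], [1.], [1.], [0.]])  # pretend the last node is padding

conv = GATConv(in_feats=emb, out_feats=8, num_heads=2)
h = conv(g, g.ndata['ft'] * g.ndata['mask'])  # padded node's features are zeroed out
print(h.shape)  # (4, 2, 8): nodes, heads, out_feats
```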

Yes, but that only works for a single graph, not a batch of graphs. What I want to handle is a batch of graphs with different numbers of nodes.

Sorry, why does it only work for a single graph? I suppose you need to initialize the masks for all graphs anyway? Could you please elaborate more on the whole computation process?

As for the parameters of the forward method (https://github.com/dmlc/dgl/blob/master/python/dgl/nn/pytorch/conv/gatconv.py#L113):

graph : DGLGraph
    The graph.
feat : torch.Tensor
    The input feature of shape :math:`(N, D_{in})` where :math:`D_{in}`
    is the size of the input feature and :math:`N` is the number of nodes.

The type of graph is DGLGraph, not BatchedDGLGraph, so the function appears to operate on only one graph at a time, not on a batch of graphs.

Yes, I do need to initialize the masks for a batch of graphs, but I cannot carry out the procedure (a minibatch GATConv operation) without a ‘for’ loop to traverse the batch of graphs.

There might be room to improve our documentation, but note that BatchedDGLGraph in fact inherits from DGLGraph, and the nn modules are supposed to work with it. Let’s say that for each graph in a list we have g.ndata['mask'], a boolean mask indicating which nodes exist. With dgl.batch(g_list), the masks are automatically concatenated, and the result can be used to mask out non-existent nodes during the computation, as in the sketch below.
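For example, something along these lines (a toy sketch with two small graphs; the API names follow a recent DGL release, where BatchedDGLGraph has since been folded into DGLGraph, so the exact calls may differ in older versions):

```python
import dgl
import torch
from dgl.nn.pytorch import GATConv

emb = 16
g_list = []
for num_nodes in (3, 5):
    # Simple ring graphs so every node has an in-degree of at least 1.
    src = list(range(num_nodes))
    dst = [(i + 1) % num_nodes for i in src]
    g = dgl.graph((src, dst))
    g.ndata['ft'] = torch.randn(num_nodes, emb)
    g.ndata['mask'] = torch.ones(num_nodes, 1)  # 1.0 for every existing node
    g_list.append(g)

# dgl.batch concatenates ndata along the node dimension,
# so the batched graph carries the masks of all graphs.
bg = dgl.batch(g_list)

conv = GATConv(in_feats=emb, out_feats=8, num_heads=2)
h = conv(bg, bg.ndata['ft'] * bg.ndata['mask'])  # one forward pass for the whole batch
print(h.shape)  # (8, 2, 8): total nodes across the batch, heads, out_feats
```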