Apply GCN on a batch of images

Hi, I tried to apply a GCN to a batch of images that all share the same graph structure (let's say fully connected), but the results are poor. Could you guys point out what I have done wrong? Here is my implementation.

I created one dgl.graph object. The model's forward(inputs, g) takes a batch of images and a single graph object as input.

I'm aware that GraphConv() expects input of shape (N, *, in_feats). I assume the additional dimension * can serve as the batch dimension, so I reshape the batch of images accordingly. The code of the GCN layer is as follows:

import torch
import torch.nn as nn
from dgl.nn.pytorch import GraphConv

class GCN(nn.Module):
    def __init__(self, in_feats, hidden_size):
        super(GCN, self).__init__()
        self.conv1 = GraphConv(in_feats, hidden_size)
        self.conv2 = GraphConv(hidden_size, in_feats)

    def forward(self, inputs, g):
        b, c, h, w = inputs.shape
        inputs = inputs.view(b, c, -1)    # B x C x N

        # Expected input shape for GraphConv: (N, *, in_feats)
        inputs = inputs.permute(2, 0, 1)  # N x B x C
        output = self.conv1(g, inputs)
        output = torch.relu(output)
        output = self.conv2(g, output)
        output = output.permute(1, 2, 0)  # B x C x N
        # permute() makes the tensor non-contiguous, so use reshape() instead of view()
        output = output.reshape(b, c, h, w)
        return output
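
For reference, here is roughly how I construct the graph and call the layer (toy sizes purely for illustration; the graph is the same complete graph for every image):

import dgl

n = 4 * 4  # toy 4x4 feature map
src, dst = zip(*[(i, j) for i in range(n) for j in range(n) if i != j])
g = dgl.graph((torch.tensor(src), torch.tensor(dst)))  # complete directed graph

model = GCN(in_feats=8, hidden_size=16)
x = torch.randn(2, 8, 4, 4)  # B x C x H x W
out = model(x, g)            # same shape as x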

Thank you.

I'm not an expert in CV, so maybe I'm wrong. The success of ConvNets in CV is largely attributed to their ability to learn hierarchical representations from pixels, from the near to the distant. From that standpoint, it doesn't seem to make sense to apply a two-layer GCN to a complete graph of pixels. Even if complete graphs worked, two GNN layers would probably be far from enough.

@HieuPhan33 The input feature for DGL GraphConv has shape (N, Din), where Din is the size of the input feature, as described in https://docs.dgl.ai/api/python/nn.pytorch.html#graphconv
The correct way to handle the batch dimension is to construct one large graph containing several small graphs, where each small graph represents a sample and there are no links between two small graphs. The batching API is https://docs.dgl.ai/generated/dgl.batch.html?highlight=batch#dgl-batch
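
To make this concrete, here is a minimal sketch of the batching pattern (the sizes and the complete-graph construction are only for illustration):

import torch
import dgl
from dgl.nn.pytorch import GraphConv

batch_size, n, in_feats = 4, 16, 8  # illustrative sizes

# Build one small complete graph per sample
src, dst = zip(*[(i, j) for i in range(n) for j in range(n) if i != j])
graphs = [dgl.graph((torch.tensor(src), torch.tensor(dst))) for _ in range(batch_size)]

# dgl.batch merges them into one large graph whose components stay disconnected
bg = dgl.batch(graphs)

# Node features of all samples are stacked along the node dimension: (B * N, Din)
feat = torch.randn(batch_size * n, in_feats)
conv = GraphConv(in_feats, in_feats)
out = conv(bg, feat)  # (B * N, Din)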

Also, as @mufeili described, CNNs work by learning hierarchical feature representations, which the code you're sharing may not be able to achieve. It would be great if you could share more about the goal you are building this model for. Thanks.


Hi all, thanks for your answers. I use a standard encoder-decoder network (DeepLab) for image segmentation.
At the penultimate layer of the encoder, I apply a GCN to the features extracted by the encoder before feeding them to the decoder. I create a single graph in train.py and store it as an attribute of the DeepLab model. When I call model(batch_images), the model extracts features and applies the GCN to the batch of extracted features using the stored graph attribute.

def forward(self, input):
        # ======= Encoder =======
        x = self.backbone(input)  # b, 2048, h, w
        # ======= GCN ===========
        x = self.gcn(x, self.g)
        # ======= Decoder =======
        x = self.decoder(x)
        # Final segmentation: upsample to the input resolution
        x = F.interpolate(x, size=input.size()[2:], mode='bilinear', align_corners=True)
        return x

In train.py:

    import dgl
    import networkx

    total_nodes = 256 * 256  # size of the extracted feature maps
    nx_g = networkx.complete_graph(total_nodes).to_directed()
    g = dgl.from_networkx(nx_g)  # convert to a DGLGraph so GraphConv can consume it
    model = DeepLab(graph=g)
    ...
    output = model(input)