Adding BiLSTM followed by an attention layer in DGL

Can anyone guide me?

import torch.nn as nn
import torch.nn.functional as F
from dgl.nn import SAGEConv, GATConv


class GraphSage_BiLSTM_GAT(nn.Module):
    def __init__(self, nfeat, nhid, nclass, dropout):
        super(GraphSage_BiLSTM_GAT, self).__init__()

        self.dropout = dropout

        self.conv1 = SAGEConv(nfeat, nhid, aggregator_type='mean')
        self.conv2 = SAGEConv(nhid, nhid, aggregator_type='mean')
        self.conv3 = GATConv(nhid, nhid, num_heads=16)

        # Bidirectional, so the LSTM output feature size is 2 * nclass
        self.LS_end = nn.LSTM(input_size=nhid, hidden_size=nclass, num_layers=8, dropout=dropout, batch_first=True,
                              bidirectional=True)

        # self.conv3 = SAGEConv(nclass, nclass, aggregator_type='mean')
        self.conv4 = GATConv(nclass, nclass, num_heads=16)

    def forward(self, x, adj):
        # DGL conv layers are called as layer(graph, feat), so adj must be a DGLGraph here
        x = F.relu(self.conv1(adj, x))  # different from self-defined gcn
        x = F.dropout(x, self.dropout, training=self.training)
        x = self.conv2(adj, x)
        .............
        .............
        return F.log_softmax(x, dim=1)

How do I call the BiLSTM layer that is initialized in the constructor inside the forward function? Later I want to add an attention layer too. I am not sure how the LSTM operates or what its purpose is here. Can I change the size of the node embedding when passing input to the first GraphSAGE layer? Can someone please explain?

Why do you want to use BiLSTM? Are you following a particular paper?

Hello Mufei,

I was trying different combinations of GNN layers to see if I could improve performance. I know the basic reason for using a BiLSTM in NLP is to capture context from future tokens, but I am not sure why a BiLSTM is used on graphs. I am following this paper:
Interpretable clustering on dynamic graphs with recurrent graph neural networks

GitHub: InterpretableClustering/InterpretableClustering

Even though I have spent about two months with this paper, I am still confused about several things, as I have some basic misconceptions about how GNNs work.

This paper deals with dynamic graphs, whose structure evolves over time. If your dataset does not fall into this domain, then using BiLSTM may not be helpful.

Apart from DBLPE, the other four datasets all have features that evolve over time, while the structure and labels remain the same. I am working with those four real datasets to see if I can find a better combination of GNN layers.

One possible solution is to apply a BiLSTM to the dynamic features over time. However, I’m not sure if it will cause data leakage issues if future features are visible to the BiLSTM.
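As a rough sketch of that idea (not the paper's method), the per-node features can be stacked into a tensor of shape (num_nodes, num_steps, in_feats) so that nodes act as the batch dimension of the LSTM. The class name TemporalFeatureEncoder and all sizes below are made up for illustration. The last lines also show the leakage concern: with bidirectional=True the embedding at step 0 changes when later features change, while a unidirectional LSTM only sees the past.

```python
import torch
import torch.nn as nn


class TemporalFeatureEncoder(nn.Module):
    """Hypothetical helper: runs an LSTM over each node's feature sequence."""

    def __init__(self, in_feats, hidden, bidirectional=True):
        super().__init__()
        self.lstm = nn.LSTM(input_size=in_feats, hidden_size=hidden,
                            batch_first=True, bidirectional=bidirectional)
        # A bidirectional LSTM concatenates forward and backward states
        self.out_feats = hidden * (2 if bidirectional else 1)

    def forward(self, x_seq):
        # x_seq: (num_nodes, num_steps, in_feats)
        out, _ = self.lstm(x_seq)
        # out: (num_nodes, num_steps, out_feats), one embedding per node per step
        return out


torch.manual_seed(0)
num_nodes, num_steps, in_feats, hidden = 5, 4, 8, 16  # illustrative sizes
x_seq = torch.randn(num_nodes, num_steps, in_feats)

encoder = TemporalFeatureEncoder(in_feats, hidden, bidirectional=True)
causal = TemporalFeatureEncoder(in_feats, hidden, bidirectional=False)

h = encoder(x_seq)  # shape (5, 4, 32)

# Perturb only the future steps (t >= 1) and compare the embeddings at t = 0
x_future_changed = x_seq.clone()
x_future_changed[:, 1:] += 1.0
with torch.no_grad():
    bi_same = torch.allclose(encoder(x_seq)[:, 0], encoder(x_future_changed)[:, 0])
    causal_same = torch.allclose(causal(x_seq)[:, 0], causal(x_future_changed)[:, 0])
```

The slice h[:, t, :] could then be passed as the node features for snapshot t to a graph layer whose input size is encoder.out_feats, for example SAGEConv(encoder.out_feats, nhid, aggregator_type='mean').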

Yes, in the four real datasets only the features are dynamic, not the structure, and class membership does not change over time. So I don’t see the point of classifying nodes at each time step when their labels don’t change.

Yes, it sounds like you need other datasets.
