Segfault in GraphConv when processing more than 11 nodes

I’m trying to move from my in-house graph library to DGL. I’ve implemented a simple model, and the graph layers don’t seem to handle the data properly. In particular, with GraphConv the process crashes on molecules with more than 11 nodes; with SAGEConv it crashes on molecules with more than 7 nodes. So in the GraphConv case, adding the guard

features = g.ndata['attr']
if features.shape[0] < 12:  # only run the forward pass for graphs with fewer than 12 nodes
    logits = model(g, features, pool_op)

fixes the issue. But this makes no sense. Here is a minimal working example:

import dgl
import dgl.data
import torch
import torch.nn as nn
import torch.nn.functional as F
from dgl.nn import GraphConv


dataset = dgl.data.QM9EdgeDataset(label_keys=['mu', 'gap', 'homo'])

class GCN(nn.Module):
    def __init__(self, in_feats, h_feats, o_feats, hidden_dim, out_dim, d_prob=0.15):
        super(GCN, self).__init__()
        self.conv1 = GraphConv(in_feats, h_feats)
        self.conv2 = GraphConv(h_feats, o_feats)
        self.fc1 = nn.Linear(o_feats, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, out_dim)

        self.d_prob = d_prob

    def forward(self, g, in_feat, pool_op):
        # two rounds of graph convolution over the input graph
        h = self.conv1(g, in_feat)
        h = self.conv2(g, h)

        # graph-level readout: pool the node embeddings into one vector per graph
        g.ndata['h'] = h
        o = dgl.readout_nodes(graph=g, feat='h', op=pool_op)

        # MLP head on the pooled representation
        x = F.relu(self.fc1(o))
        x = F.dropout(x, p=self.d_prob, training=self.training)  # only active in training mode
        x = self.fc2(x)

        return x


in_feats = 11
h_feats = 128
o_feats = 64
hidden_dim = 128
out_dim = 5

pool_op = 'sum'

model = GCN(in_feats, h_feats, o_feats, hidden_dim, out_dim, d_prob=0.15)

for idx, (g, label) in enumerate(dataset):
    features = g.ndata['attr']
    logits = model(g, features, pool_op)

    print(idx)

print(g.edata)
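
As a sanity check on the node-count claim above, the same loop can log each graph's size right before the forward pass; num_nodes() is DGL's standard accessor and should match features.shape[0] here. This is only a diagnostic sketch, not part of the model:

for idx, (g, label) in enumerate(dataset):
    features = g.ndata['attr']
    print(idx, g.num_nodes(), features.shape[0])  # the crash should correlate with the node count
    logits = model(g, features, pool_op)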

Here is the error that I get:

Done loading data from cached files.
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19

Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)
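
Since exit code 139 is a raw SIGSEGV rather than a Python exception, there is no traceback to go on. One thing that might help (faulthandler is in the standard library and only adds two lines before the loop) is dumping the Python stack when the signal arrives:

import faulthandler
faulthandler.enable()  # print the Python stack to stderr if the process receives SIGSEGV

for idx, (g, label) in enumerate(dataset):
    features = g.ndata['attr']
    logits = model(g, features, pool_op)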

Would appreciate any input.

Hi @blade, your code looks fine and runs without issue in my VM, so I suspect it could be caused by your environment. Could you check it? Posting mine for reference:

  • DGL version: 0.9.0
  • PyTorch version: 1.12.0
  • VM memory: 62 GB
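
To compare directly, you can print yours with the standard version attributes on both packages (torch.version.cuda is None on a CPU-only build):

import dgl
import torch

print(dgl.__version__)    # e.g. 0.9.0
print(torch.__version__)  # e.g. 1.12.0
print(torch.version.cuda) # None for a CPU-only build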
