The classification accuracy of DGCNN based on DGL is much lower than that based on stellargraph

Hey! I read your community’s reply on building DGCNN and learned about DGCNN from it. Here is my code for building DGCNN: one version uses DGL and the other uses stellargraph.

I keep the hyperparameters, training dataset, and validation dataset the same for both. However, the stellargraph-based DGCNN reaches a much higher classification accuracy (0.93) than the DGL-based one (0.83). What could cause this gap, and how can I make the DGL-based DGCNN more accurate?


DGL-based DGCNN

import math

import torch
import torch.nn as nn
import torch.nn.functional as F
from dgl.nn import GraphConv, JumpingKnowledge, SortPooling


class Classifier(nn.Module):
    def __init__(self, n_classes):
        super().__init__()

        # Graph convolutions: 59 input features -> 1024 -> 1024 -> 512
        self.gcn1 = GraphConv(59, 1024, norm='right')
        self.gcn2 = GraphConv(1024, 1024, norm='right')
        self.gcn3 = GraphConv(1024, 512, norm='right')

        # Concatenate the layer outputs (1024 + 1024 + 1024 + 512 = 3584) and keep the top 10 nodes
        self.jk = JumpingKnowledge()
        self.sortpooling = SortPooling(k=10)
        self.conv1D_1 = nn.Conv1d(1, 256, kernel_size=3584, stride=3584)
        self.maxpooling = nn.MaxPool1d(2)
        self.conv1D_2 = nn.Conv1d(256, 512, kernel_size=5)
        # Sequence length after conv1D_1, maxpooling and conv1D_2
        self.tmp = math.floor(((10 * 3584 - 3584) / 3584 + 1) / 2) - (5 - 1)
        # conv1D_2 outputs 512 channels, so the flattened size is 512 * tmp
        self.fc1 = nn.Linear(512 * self.tmp, 1024)
        self.classify = nn.Linear(1024, n_classes)
        self.softmax = nn.Softmax(dim=1)

    def forward(self, g, h):
        h1 = torch.tanh(self.gcn1(g, h))
        h1 = h1.flatten(1)
        h2 = torch.tanh(self.gcn2(g, h1))
        h2 = h2.flatten(1)
        # gcn2 is applied a second time here
        h3 = torch.tanh(self.gcn2(g, h2))
        h3 = h3.flatten(1)
        h4 = torch.tanh(self.gcn3(g, h3))
        h4 = h4.flatten(1)
        # h = torch.cat((h1, h2, h3, h4), dim=1)
        h = self.jk([h1, h2, h3, h4])
        h = self.sortpooling(g, h)
        # Add a channel dimension: (batch, 1, 10 * 3584)
        h = h.view(-1, 1, 35840)

        # conv+pool
        h = self.conv1D_1(h)
        h = self.maxpooling(h)
        h = self.conv1D_2(h)

        h = h.flatten(1)
        h = F.relu(self.fc1(h))
        h = F.dropout(h, p=0.25)
        h = self.classify(h)

        with g.local_scope():
            return self.softmax(h)
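
The sizes in the 1D convolution head can be checked with the standard output-length formula, floor((L - kernel) / stride) + 1. A small sketch, assuming no padding or dilation (the PyTorch defaults for Conv1d and MaxPool1d); the helper name `conv1d_out_len` is only for illustration:

```python
def conv1d_out_len(length, kernel_size, stride=1):
    """Output length of an unpadded, undilated 1-D convolution or pooling layer."""
    return (length - kernel_size) // stride + 1


k = 10                                 # nodes kept by SortPooling
channels = 1024 + 1024 + 1024 + 512    # concatenated GCN outputs = 3584

length = k * channels                                # 35840 values per graph
length = conv1d_out_len(length, channels, channels)  # Conv1d(1, 256, 3584, stride=3584) -> 10
length = conv1d_out_len(length, 2, 2)                # MaxPool1d(2) -> 5
length = conv1d_out_len(length, 5)                   # Conv1d(256, 512, 5) -> 1

flattened = 512 * length   # the last convolution has 512 output channels
print(length, flattened)   # 1 512
```

So the tensor that reaches `fc1` after flattening has 512 features per graph.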

stellargraph-based DGCNN

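from stellargraph.layer import DeepGraphCNN
from stellargraph.mapper import PaddedGraphGenerator
from tensorflow.keras import Model
from tensorflow.keras.layers import Conv1D, Dense, Dropout, Flatten, MaxPool1D

generator = PaddedGraphGenerator(graphs=graphs)

layer_sizes = [1024, 1024, 1024, 512]

dgcnn_model = DeepGraphCNN(
    layer_sizes=layer_sizes,
    activations=["tanh", "tanh", "tanh", "tanh"],
    k=10,  # assumption: k matches SortPooling(k=10) in the DGL version
    generator=generator,
)
x_inp, x_out = dgcnn_model.in_out_tensors()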

x_out = Conv1D(filters=256, kernel_size=sum(layer_sizes), strides=sum(layer_sizes))(x_out)
x_out = MaxPool1D(pool_size=2)(x_out)

x_out = Conv1D(filters=512, kernel_size=5, strides=1)(x_out)

x_out = Flatten()(x_out)

x_out = Dense(units=1024, activation="relu")(x_out)
x_out = Dropout(rate=0.25)(x_out)

predictions = Dense(units=len(apps), activation="softmax")(x_out)

model = Model(inputs=x_inp, outputs=predictions)

Hi, I don’t think DGL has an official DGCNN example, so could you let me know what your code is based on? My high-level suggestion when comparing two implementations is to start from the simplest model. For example, start with just the three GCN layers in your example and gradually add the remaining layers until the results diverge. The pitfall is usually a subtle difference, e.g., layer sizes, learning rate, etc.
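
One cheap way to start that comparison is to check that both models have the same number of trainable weights before looking at accuracy. Here is a rough sketch in plain Python that counts only the GCN weight matrices (biases ignored). `gcn_weight_count` is a made-up helper, and the input size 59 is taken from your `GraphConv(59, 1024)`:

```python
def gcn_weight_count(in_feats, layer_sizes):
    """Entries in the weight matrices of a stack of distinct GCN layers (biases ignored)."""
    total = 0
    for out_feats in layer_sizes:
        total += in_feats * out_feats
        in_feats = out_feats
    return total


in_feats = 59
layer_sizes = [1024, 1024, 1024, 512]

# Four separately parameterised layers, as layer_sizes in the stellargraph snippet suggests.
four_layers = gcn_weight_count(in_feats, layer_sizes)

# The DGL forward pass calls self.gcn2 twice, so its 1024 x 1024 matrix exists only once.
shared = 59 * 1024 + 1024 * 1024 + 1024 * 512

print(four_layers, shared, four_layers - shared)  # 2681856 1633280 1048576
```

On the real models you could compare `sum(p.numel() for p in model.parameters())` on the PyTorch side with `model.count_params()` on the Keras side. Two other things I would check in the DGL version, since your training loop is not shown: `F.dropout` defaults to `training=True`, so it stays active at evaluation time unless you pass `training=self.training`, and if your loss is `nn.CrossEntropyLoss` the model should return raw logits rather than softmax output.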