# How to use DGL for a binary classification problem?

Hi,
I’m new to DGL and also to GCNs. I have a set of graphs with associated binary labels (0 or 1). I implemented a classifier by following the DGL tutorial (https://docs.dgl.ai/en/latest/tutorials/basics/4_batch.html) and ended up with around 70% test accuracy. However, the tutorial problem is multiclass, while mine is binary. Could anyone please help me figure out which parts of my classifier I need to modify so that it works well for binary classification?

I think you just need to set `n_classes` to be 2 in `Classifier`.

Yes, thank you mufeili, that’s true. I used `n_classes=2` when building the classifier mentioned above. But I wanted to know how to use binary cross-entropy as the loss function with a sigmoid output layer.

This is what I have implemented so far.

``````
# Create model
classifier_model = Classifier(1, 256, 1)
loss_func = nn.BCELoss()
classifier_model.train()

epoch_losses = []
for epoch in range(10):
    epoch_loss = 0
    for iter, (bg, label) in enumerate(train_data_loader):
        prediction = classifier_model(bg)
        loss = loss_func(torch.sigmoid(prediction), label.float())
        loss.backward()
        optimizer.step()
        epoch_loss += loss.detach().item()
    epoch_loss /= (iter + 1)
    print('Epoch {}, loss {:.4f}'.format(epoch, epoch_loss))
    epoch_losses.append(epoch_loss)

classifier_model.eval()

total = 0
y_pred = []
y_true = []

graphs, labels = data
outputs = torch.softmax(classifier_model(graphs), 1)
_, predicted = torch.max(outputs.data, 1)
total += labels.size(0)
y_pred.append(predicted)
y_true.append(labels)

# Print results
print("Accuracy: %.2f%%" % ((accuracy_score(y_true, y_pred, normalize=False)) / total * 100))
print("Precision: %.2f%%" % (metrics.precision_score(y_true, y_pred) * 100))
print("Recall: %.2f%%" % (metrics.recall_score(y_true, y_pred) * 100))
print("F1-Score: %.2f%%" % (metrics.f1_score(y_true, y_pred) * 100))
``````

However, my model performs as follows.

``````
Accuracy: 55.56%
Precision: 0.00%
Recall: 0.00%
F1-Score: 0.00%
``````

Can anyone please tell me where I made the mistake?

I’d suggest using `BCEWithLogitsLoss` rather than `BCELoss`, as the latter can have numerical stability issues. As for the model performance, you need to check the following:

1. How balanced is your dataset? What proportion of the dataset are positive/negative?
2. Did you try tuning the hyperparameters?
3. How did you initialize node features?
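
To illustrate the numerical stability point, here is a minimal plain-Python sketch (not PyTorch’s actual implementation). It compares a naive sigmoid-then-log binary cross-entropy with a commonly used stable rewrite that works directly on the logit. The function names `naive_bce` and `stable_bce_with_logits` are made up for this example.

```python
import math

def naive_bce(z, y):
    """Sigmoid followed by the textbook BCE formula (can fail for large |z|)."""
    p = 1.0 / (1.0 + math.exp(-z))
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def stable_bce_with_logits(z, y):
    """Algebraically equal form: max(z, 0) - z*y + log(1 + exp(-|z|))."""
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))

# For a moderate logit both versions agree.
print(naive_bce(2.0, 1), stable_bce_with_logits(2.0, 1))

# For a confident but wrong prediction the naive version breaks down:
# sigmoid(40) rounds to exactly 1.0, so log(1 - p) becomes log(0).
try:
    naive_bce(40.0, 0)
except ValueError as err:
    print("naive version failed:", err)

# The stable form still returns a finite loss.
print(stable_bce_with_logits(40.0, 0))
```

As far as I know, PyTorch’s `BCELoss` does not raise in this situation because it clamps its log outputs, so the failure mode there is different, but the underlying precision problem is the same one the sketch shows.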

I tried `BCEWithLogitsLoss` but I’m getting the same results.

1. The dataset is balanced: 50% of the graphs have label 0 and 50% have label 1.

2. I did some tuning, but my problem is this: when I use `nn.CrossEntropyLoss()` as the loss function with `n_classes` set to `2`, I get the following performance, but I get nothing like it with `BCEWithLogitsLoss` or `BCELoss`. Do I need to make any modifications to the classifier itself? I’m using the same one as in the tutorial.

``````
# Performance when CrossEntropyLoss is used as the loss function (epoch = 100)
Accuracy: 88.89%
Precision: 100.00%
Recall: 75.00%
F1-Score: 85.71%
``````

This is the classifier code:

``````
class Classifier(nn.Module):
    def __init__(self, in_dim, hidden_dim, n_classes):
        super(Classifier, self).__init__()

        self.layers = nn.ModuleList([
            GCN(in_dim, hidden_dim, F.relu),
            GCN(hidden_dim, hidden_dim, F.relu),
            GCN(hidden_dim, hidden_dim, F.relu),
            GCN(hidden_dim, hidden_dim, F.relu)])
        self.classify = nn.Linear(hidden_dim, n_classes)
        self.dropout = nn.Dropout(p=0.2)

    def forward(self, g):
        # For undirected graphs, in_degree is the same as
        # out_degree.
        h = g.in_degrees().view(-1, 1).float()
        for conv in self.layers:
            h = conv(g, h)
        g.ndata['h'] = h
        hg = dgl.mean_nodes(g, 'h')
        return self.classify(hg)
``````
3. The node features are in an N*d matrix (N is the number of nodes, d the number of node features). Not every node has every feature; missing features are represented by 0. Also, d is not the same for all graphs, and neither is N. In the following line I assign the matrix to the graph using the DGL library.
`g.ndata['features'] = torch.Tensor(list(X)) #X is the N*d feature matrix and it's a numpy array`

Since I’m getting good results with `CrossEntropyLoss`, I’m fairly confident that the classifier only needs to be changed a little to work with `BCELoss()`, but I couldn’t find out how to do that. Please help.
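
For context on why the two setups should be able to reach similar performance: a two-class softmax with cross-entropy is mathematically the same as a sigmoid applied to the difference of the two logits with binary cross-entropy. The plain-Python sketch below checks this numerically. The logit values and the helper names `softmax2` and `sigmoid` are made up for illustration.

```python
import math

def softmax2(z0, z1):
    """Softmax over two logits, shifted by the max for stability."""
    m = max(z0, z1)
    e0, e1 = math.exp(z0 - m), math.exp(z1 - m)
    return e0 / (e0 + e1), e1 / (e0 + e1)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical logits from a 2-output classifier for one graph.
z0, z1 = -0.3, 1.2
y = 1  # true label

# Probability of class 1 under both views.
p1_softmax = softmax2(z0, z1)[1]
p1_sigmoid = sigmoid(z1 - z0)

# Cross-entropy on the softmax output vs. binary cross-entropy on the sigmoid.
ce = -math.log(softmax2(z0, z1)[y])
bce = -(y * math.log(p1_sigmoid) + (1 - y) * math.log(1 - p1_sigmoid))

print(p1_softmax, p1_sigmoid)
print(ce, bce)
```

So a classifier with a single output unit trained with BCE is not a weaker model than the two-output version. If the results differ a lot, the difference is more likely in how the loss is fed or how predictions are decoded than in the network itself.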

Could you please try printing the shape & value of your prediction and labels and also check the result returned by `BCEWithLogitsLoss`?
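
A minimal sketch of such a check, using random stand-in tensors rather than the real model output (the batch size of 4 and all values are assumptions for illustration). It also shows one thing worth looking at in the evaluation code posted earlier: with a single output unit, `torch.softmax(..., 1)` always returns 1.0, so `torch.max(..., 1)` picks index 0 for every graph. That would be consistent with the 0% precision and recall reported, though it may not be the only issue.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-ins for one batch: 4 graphs, classifier with a single output unit.
prediction = torch.randn(4, 1)       # raw logits, shape (batch, 1)
label = torch.tensor([0, 1, 1, 0])   # integer labels, shape (batch,)

# Step 1: inspect shapes and values.
print(prediction.shape, label.shape)
print(prediction, label)

# Step 2: BCEWithLogitsLoss takes raw logits (no sigmoid beforehand) and
# expects the input and a float target of the same shape.
loss_func = nn.BCEWithLogitsLoss()
logits = prediction.squeeze(1)       # (batch, 1) -> (batch,)
loss = loss_func(logits, label.float())
print(loss.item())

# Step 3: decoding predictions. Softmax over a single column is always 1.0,
# so the argmax along dim 1 is index 0 for every graph.
softmax_out = torch.softmax(prediction, 1)
_, predicted_softmax = torch.max(softmax_out, 1)
print(predicted_softmax)

# Thresholding the sigmoid probability at 0.5 gives real 0/1 predictions.
predicted_sigmoid = (torch.sigmoid(logits) > 0.5).long()
print(predicted_sigmoid)
```

If the shapes printed in step 1 differ, as `(batch, 1)` and `(batch,)` do here, reshaping one of them as in step 2 is a reasonable first thing to try.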