Getting started with multiple node features in a homogeneous graph

Looking for a bit of direction and understanding here. I’ve spent a few nights comparing the various PyTorch examples with the various DGL examples, but I have not been able to extract much meaning from the Hetero example in the docs.

Here is the ndata of a basic 3-node graph with 2 features. I am using this simple graph to feel out the library.

Features in ndata

g.ndata = {
    # continuous feature
    'n_weight': tensor([
        [1.11],
        [2.22],
        [3.33]
    ]), 
    # categorical/ discrete feature
    'n_community': tensor([
        [1, 0, 0],
        [0, 1, 0],
        [0, 0, 1]
    ])
}
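
For readers following along, here is a minimal sketch of how such a graph could be constructed in a recent DGL version. The edge list (0→1, 1→2) is just an assumption, since only the node data is shown, and the one-hot community feature is stored as floats so a graph conv can consume it.

import dgl
import torch

# hypothetical 3-node chain; the real edges are not shown in the post
g = dgl.graph((torch.tensor([0, 1]), torch.tensor([1, 2])), num_nodes=3)

# continuous feature: one scalar per node, shape (3, 1)
g.ndata['n_weight'] = torch.tensor([[1.11], [2.22], [3.33]])

# categorical feature: one-hot community membership, shape (3, 3), stored as float
g.ndata['n_community'] = torch.tensor([[1., 0., 0.],
                                       [0., 1., 0.],
                                       [0., 0., 1.]])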

Defining an NN that will take both features into account
(I know it’s heinous, I’m just trying to understand)

class GraphClassifier(nn.Module):
    def __init__(self, in_dim, hidden_dim, n_classes):
        super(GraphClassifier, self).__init__()
        self.conv1 = GraphConv(in_dim, hidden_dim)
        self.conv2 = GraphConv(hidden_dim, hidden_dim)
        # flatten into a linear layer so we can apply softmax / cross-entropy.
        self.classify = nn.Linear(hidden_dim, n_classes)

        
    def forward(self, g):

        # run the weight feature through the net
        w = g.ndata['n_weight']
        w = F.relu(self.conv1(g, w))
        w = F.relu(self.conv2(g, w))
        g.ndata['n_weight'] = w
        
        # run the community feature through the net
        c = g.ndata['n_community']
        c = F.relu(self.conv1(g, c))
        c = F.relu(self.conv2(g, c))
        g.ndata['n_community'] = c
        
        # combine both features into one tensor
        wc = torch.cat((w, c), 1)
        return self.classify(wc)
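
As a quick aside on shapes: torch.cat((w, c), 1) places the two hidden representations side by side, so the result has 2*hidden_dim columns per node, and a Linear layer consuming it would need in_features=2*hidden_dim. A minimal check with made-up values:

import torch

hidden_dim, num_nodes = 3, 3
w = torch.randn(num_nodes, hidden_dim)   # stand-in for the weight branch output
c = torch.randn(num_nodes, hidden_dim)   # stand-in for the community branch output
wc = torch.cat((w, c), 1)
print(wc.shape)                          # torch.Size([3, 6]) -> 2*hidden_dim columns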

Questions

  • Q1: mapping the features
    – In my forward, is it possible to access the features as attributes of g like so g.ndata['key']?
    – Or do I need to do a self.features = nn.Sequential(...layers...) and pass them into forward, like def forward(self, g, n_weight, n_community)?
    – Or could I use nn.Parameters('ndata') to get it into forward?
    – Just need a standard approach.

  • Q2: reducing to an output
    In the wc above, I am trying to combine the different features in the forward pass. If I understand correctly, I need to feed each feature through the network and then flatten them somehow so Linear can run on them?

  • Q1: mapping the features
    – In my forward, is it possible to access the features as attributes of g like so g.ndata['key']?
    – Or do I need to do a self.features = nn.Sequential(...layers...) and pass them into forward, like def forward(self, g, n_weight, n_community)?
    – Or could I use nn.Parameters('ndata') to get it into forward?
    – Just need a standard approach.

Since our DGLGraph has many APIs, we don’t save features as attributes; g.ndata['key'] is the correct way to get/set features. There is no single standard way to write an NN module, but in DGL we do have a convention of passing feature inputs explicitly as function arguments. We find it makes the function more self-explanatory (e.g., “the module takes a graph and two feature tensors, for node weight and community, and returns a prediction”). You could take a look at our GraphConv implementation.
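
A rough sketch of that convention (the module name and dimensions below are made up for illustration; it simply mirrors GraphConv's forward(graph, feat) signature):

import torch.nn as nn
import torch.nn.functional as F
from dgl.nn.pytorch import GraphConv

class TwoFeatureModule(nn.Module):
    """Hypothetical module: features are passed in explicitly, not read from g.ndata."""
    def __init__(self, weight_dim, community_dim, hidden_dim):
        super().__init__()
        self.conv_w = GraphConv(weight_dim, hidden_dim)
        self.conv_c = GraphConv(community_dim, hidden_dim)

    # the caller does: model(g, g.ndata['n_weight'], g.ndata['n_community'])
    def forward(self, g, n_weight, n_community):
        hw = F.relu(self.conv_w(g, n_weight))
        hc = F.relu(self.conv_c(g, n_community))
        return hw, hc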

  • Q2: reducing to an output
    In the wc above, I am trying to combine the different features in the forward pass. If I understand correctly, I need to feed each feature through the network and then flatten them somehow so Linear can run on them?

Your code makes total sense to me. I will do the same :slight_smile:.

Thank you @minjie, it was greatly reassuring to learn that I was on the right track. I also spent some time reading through that GraphConv source code.

Here is where I am after several hours of trial and error. Things become unclear for me when I try to get a prediction by feeding my combined feature tensor into the linear layer.

Model

class Classifier(nn.Module):
    def __init__(self, in_dim, hidden_dim, n_classes, n_features):
        super(Classifier, self).__init__()
        
        self.conv1 = GraphConv(in_dim, hidden_dim)

        # using `*n_features` below so the linear layer accepts the concatenated feature tensor.
        self.classify = nn.Linear(hidden_dim*n_features, n_classes)

        
    def forward(self, g, nodes_weights, nodes_communities):
               
        nw = nodes_weights
        nw = F.relu(self.conv1(g, nw))
        print(nw)
        print("^---conv_weights---^")
        
        nc = nodes_communities
        nc = F.relu(self.conv1(g, nc))
        print(nc)
        print("^---conv_communities---^")
        
        
        nwc = torch.cat((nw, nc), 1)
        print(nwc)
        print("^---combined_weight_community---^")
        #return self.classify(nwc)
    
        # (1) *** see question (1) at the bottom of this post ***

        # Given a tensor with a row for each node,
        # take the average by column,
        # and return a single row to feed into the linear layer.
        nwc_mean = nwc.mean(dim=0)
        print(nwc_mean)
        print("^---averaged_weight_community---^")
        
        return self.classify(nwc_mean)

Train
(just using a batch_size of 1 for now)

import torch.optim as optim

model = Classifier(
    in_dim=1,
    hidden_dim=3,
    n_classes=num_classes,
    n_features=2 # added this to deal with concatenated tensors
)

# loss_func = nn.BCEWithLogitsLoss()
loss_func = nn.CrossEntropyLoss()


optimizer = optim.Adam(model.parameters(), lr=0.010)#0.001
model.train()

epochs = 30

epoch_losses = []

for epoch in range(epochs):
    epoch_loss = 0
    for iter, (g, l) in enumerate(dataset_train):
        
        print(g)
        print("^---batched_graph---^")
        
        nodes_weights = g.ndata['n_weight']
        nodes_communities = g.ndata['n_community']
        prediction = model(g, nodes_weights, nodes_communities)
        
        print(prediction)
        print("^---pred---^")
        print(l)
        print("^---label---^")
        

        # (2) *** See question at bottom of comment ***
        loss = loss_func(prediction, l)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        epoch_loss += loss.detach().item()
    epoch_loss /= (iter + 1)
    print('Epoch {}, loss {:.4f}'.format(epoch, epoch_loss))
    epoch_losses.append(epoch_loss)

Printed Outputs

DGLGraph(num_nodes=3, num_edges=2,
         ndata_schemes={'n_weight': Scheme(shape=(1,), dtype=torch.float32), 'n_community': Scheme(shape=(1,), dtype=torch.int64)}
         edata_schemes={})
^---batched_graph---^

tensor([[0.0000, 0.0000, 0.0000],
        [0.0000, 0.0000, 0.1569],
        [0.0000, 0.0000, 0.1712]], grad_fn=<ReluBackward0>)
^---conv_weights---^

tensor([[0.0000, 0.0000, 0.0000],
        [0.0000, 0.0000, 0.1427],
        [0.0000, 0.0000, 0.0000]], grad_fn=<ReluBackward0>)
^---conv_communities---^

tensor([[0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
        [0.0000, 0.0000, 0.1569, 0.0000, 0.0000, 0.1427],
        [0.0000, 0.0000, 0.1712, 0.0000, 0.0000, 0.0000]],
       grad_fn=<CatBackward>)
^---combined_weight_community---^

tensor([0.0000, 0.0000, 0.1094, 0.0000, 0.0000, 0.0476],
       grad_fn=<MeanBackward1>)
^---averaged_weight_community---^

tensor([ 0.2296, -0.4287], grad_fn=<AddBackward0>)
^---pred---^

tensor([1])
^---label---^
  1. When I passed all 3 rows into the linear layer, my prediction contained 3 rows, which obviously caused an error when I compared it against the label, which has 1 row. So I tried to average it down to a single row… not sure if this is what I am supposed to do.

  2. My batch contains a single graph with 3 nodes, and I have two classes (which I know works with softmax). What should the dimensions of prediction and l be when I pass them into loss_func? I am only getting a single 1-D row from pred when I think it should be 2-D?

For a graph classification task, after you have computed the node and edge representations, you need to perform a readout operation to get a graph-level representation. There are many ways to do that, such as averaging or summing all the node representations, or even using attention.
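
A minimal sketch of that readout step using DGL's built-in helpers; the feature name 'h', the toy graph, and the sizes are placeholders:

import dgl
import torch
import torch.nn as nn

# toy stand-in for one graph (names and sizes are placeholders, not from the reply)
g = dgl.graph((torch.tensor([0, 1]), torch.tensor([1, 2])), num_nodes=3)
hidden_dim, n_classes = 3, 2
g.ndata['h'] = torch.randn(3, hidden_dim)      # per-node representations

hg = dgl.mean_nodes(g, 'h')                    # readout by averaging -> (batch_size, hidden_dim)
# dgl.sum_nodes(g, 'h') would sum the node representations instead

classify = nn.Linear(hidden_dim, n_classes)
logits = classify(hg)                          # (batch_size, n_classes), here (1, 2)

labels = torch.tensor([1])                     # (batch_size,) integer class ids
loss = nn.CrossEntropyLoss()(logits, labels)   # these are the shapes CrossEntropyLoss expects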

Thank you @minjie. Here is a working example of a model with multiple node features. I hope it will serve as a guide for future graph scientists. On to edge values =)

Model

import dgl  # needed for dgl.mean_nodes below
from dgl.nn.pytorch import GraphConv
import torch.nn as nn
import torch.nn.functional as F

class Classifier(nn.Module):
    def __init__(self, in_dim, hidden_dim, n_classes):
        super(Classifier, self).__init__()
        
        self.conv1 = GraphConv(in_dim, hidden_dim)
        self.conv2 = GraphConv(hidden_dim, hidden_dim)
        # then feed into a linear layer so we can apply softmax / cross-entropy.
        self.classify = nn.Linear(hidden_dim, n_classes)

        
    def forward(self, g, nodes_weights, nodes_communities):
        
        # should I still use these degrees as a useful feature?
        #d = g.in_degrees().view(-1, 1).float()
        
        nw = nodes_weights
        nw = F.relu(self.conv1(g, nw))
        nw = F.relu(self.conv2(g, nw))
        g.ndata['nw'] = nw
        nw = dgl.mean_nodes(g, 'nw')
        
        nc = nodes_communities
        nc = F.relu(self.conv1(g, nc))
        nc = F.relu(self.conv2(g, nc))
        g.ndata['nc'] = nc
        nc = dgl.mean_nodes(g, 'nc')
            
        # would it be better to average these tensors by row?
        nwc = nw.add(nc)
        
        return self.classify(nwc)
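
A quick sanity check of the model above on a toy graph (both features are assumed to be float tensors of shape (num_nodes, in_dim); self-loops are added because newer GraphConv versions reject zero-in-degree nodes by default):

import dgl
import torch

model = Classifier(in_dim=1, hidden_dim=3, n_classes=2)

# toy 3-node graph; self-loops keep GraphConv happy with zero-in-degree nodes
g = dgl.graph((torch.tensor([0, 1]), torch.tensor([1, 2])), num_nodes=3)
g = dgl.add_self_loop(g)

weights = torch.tensor([[1.11], [2.22], [3.33]])   # (3, 1) float
communities = torch.tensor([[0.], [1.], [2.]])     # (3, 1) float, placeholder ids

logits = model(g, weights, communities)
print(logits.shape)   # torch.Size([1, 2]) -- one graph-level prediction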

Train

import torch.optim as optim

model = Classifier(
    in_dim=1,
    hidden_dim=3,
    n_classes=num_classes,
)
loss_func = nn.CrossEntropyLoss()


optimizer = optim.Adam(model.parameters(), lr=0.010)#0.001
model.train()

epochs = 30

epoch_losses = []

for epoch in range(epochs):
    epoch_loss = 0
    for iter, (g, l) in enumerate(dataset_train): 
        
        nodes_weights = g.ndata['n_weight']
        nodes_communities = g.ndata['n_community']
        prediction = model(g, nodes_weights, nodes_communities)

        loss = loss_func(prediction, l)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        epoch_loss += loss.detach().item()
    epoch_loss /= (iter + 1)
    print('Epoch {}, loss {:.4f}'.format(epoch, epoch_loss))
    epoch_losses.append(epoch_loss)