About setting initializer


#1

I’m trying to write a GCN model following the instructions in the documentation here: Graph Convolutional Network.

I chose MXNet as my backend and converted the PyTorch example code to Gluon. But when I train the model, I get a warning saying that I have not set an initializer:

/data/anaconda3/envs/insis_template_3.6/lib/python3.6/site-packages/dgl/frame.py:204: UserWarning: Initializer is not set. Use zero initializer instead. To suppress this warning, use `set_initializer` to explicitly specify which initializer to use.
  dgl_warning('Initializer is not set. Use zero initializer instead.'

I don’t know what the initializer means. My source code is:

import os
os.environ['DGLBACKEND'] = 'mxnet'

import dgl
import dgl.function as fn
import mxnet as mx
from mxnet.gluon import nn
from mxnet import nd
from dgl import DGLGraph
from mxnet import gluon

gcn_msg = fn.copy_src(src = 'h', out = 'm')
gcn_reduce = fn.sum(msg = 'm', out = 'h')

class NodeApplyModule(nn.Block):
    def __init__(self, out_feats, activation):
        super(NodeApplyModule, self).__init__()
        self.linear = nn.Dense(out_feats)
        self.activation = activation

    def forward(self, node):
        h = self.linear(node.data['h'])
        h = self.activation(h)
        return {'h' : h}

class GCN(nn.Block):
    def __init__(self, out_feats, activation):
        super(GCN, self).__init__()
        self.apply_mod = NodeApplyModule(out_feats, activation)

    def forward(self, g, feature):
        g.ndata['h'] = feature
        g.update_all(gcn_msg, gcn_reduce)
        g.apply_nodes(func = self.apply_mod)
        return g.ndata.pop('h')

class Net(nn.Block):
    def __init__(self):
        super(Net, self).__init__()
        self.gcn1 = GCN(16, nd.relu)
        self.gcn2 = GCN(7, nd.relu)

    def forward(self, g, features):
        x = self.gcn1(g, features)
        x = self.gcn2(g, x)
        return x
    
net = Net()
net.initialize(init = mx.init.Xavier())

from dgl.data import citation_graph as citegrh
def load_cora_data():
    data = citegrh.load_cora()
    data.graph.add_edges_from([(i,i) for i in range(len(data.graph))])
    features = nd.array(data.features)
    labels = nd.array(data.labels)
    mask = nd.array(data.train_mask)
    g = DGLGraph(data.graph, readonly = True)
    return g, features, labels, mask

import time
import numpy as np
g, features, labels, mask = load_cora_data()

optimizer = gluon.Trainer(net.collect_params(), 'adam', {'learning_rate': 1e-3})
loss = gluon.loss.SoftmaxCrossEntropyLoss()

dur = []
for epoch in range(1000):
    t0 = time.time()
        
    with mx.autograd.record():
        pred = net(g, features)
        l = loss(pred, labels, nd.expand_dims(mask, 1))

    l.backward()
    optimizer.step(batch_size = 1)

    dur.append(time.time() - t0)

    print("Epoch {:05d} | Loss {:.4f} | Time(s) {:.4f}".format(epoch, l.mean().asscalar(), np.mean(dur)))

#2

Hi Davidham3,

Thank you for your feedback. The warning comes from the line g.update_all(gcn_msg, gcn_reduce). In DGL we keep many tables for things like node features, edge features, and messages sent/received during message passing. Since some nodes have no incoming/outgoing edges, we need a placeholder for their rows in these large tables, and the default placeholder is a zero tensor whose size matches the existing rows. When this default zero tensor is used, the warning is raised. We should probably avoid generating these warnings when this is only for internal use, and we are sorry if this bothers you.
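To make the placeholder idea concrete, here is a rough plain-Python sketch of a sum reducer. It is only an illustration, not how DGL implements message passing; the function name sum_reduce and its placeholder argument are made up for this example.

```python
# Toy sum-reduce over incoming edges in plain Python.
# Hypothetical illustration only; this is not DGL's implementation.
def sum_reduce(num_nodes, edges, feats, placeholder=0.0):
    """Sum source features over each node's incoming edges.

    edges: list of (src, dst) pairs; feats: one float per node.
    Nodes that receive no message keep the placeholder value.
    """
    out = [placeholder] * num_nodes
    received = [False] * num_nodes
    for src, dst in edges:
        if not received[dst]:
            # First message for this node: start its sum from zero.
            out[dst] = 0.0
            received[dst] = True
        out[dst] += feats[src]
    return out


edges = [(0, 1), (2, 1), (1, 2)]  # node 0 has no incoming edge
feats = [1.0, 2.0, 3.0]
result = sum_reduce(3, edges, feats)
print(result)  # [0.0, 4.0, 2.0]
```

Node 0 never receives a message, so its row in the output is whatever the placeholder is. With the default that is zero, which is the case the warning is telling you about.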

Meanwhile, when only some but not all nodes/edges have features, you may want to set an initializer for them. Check set_n_initializer and set_e_initializer.
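As a rough picture of what an initializer does in that case, here is a small plain-Python sketch. The helpers zero_init and set_partial_feature are hypothetical names for this example and are not part of DGL; the real initializer signature is different, so treat this only as the idea.

```python
# Hypothetical helpers, not part of DGL: fill feature rows that were never set.
def zero_init(shape):
    """Return a rows x cols table of zeros for the missing rows."""
    rows, cols = shape
    return [[0.0] * cols for _ in range(rows)]


def set_partial_feature(num_nodes, given, feat_dim, initializer=zero_init):
    """Build a full feature table when only some nodes have features.

    given maps node id -> feature row; every other row comes from initializer.
    """
    missing = [n for n in range(num_nodes) if n not in given]
    fill = initializer((len(missing), feat_dim))
    table = [None] * num_nodes
    for node, row in given.items():
        table[node] = list(row)
    for node, row in zip(missing, fill):
        table[node] = row
    return table


# Only nodes 0 and 2 have features; nodes 1 and 3 are filled by the initializer.
table = set_partial_feature(4, {0: [1.0, 2.0], 2: [3.0, 4.0]}, feat_dim=2)
print(table)
```

In DGL itself you would pass the initializer to the graph, e.g. via set_n_initializer for node features. If I remember correctly the built-in zero one is dgl.init.zero_initializer, but please check the docs for the version you have installed.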


#3

Thanks a lot. After setting the node initializer, the warning disappeared.