Cannot update column of scheme x using feature of scheme y

Hi All,

I got the error in the title, and my GNN is very similar to the batched GCN example:

https://docs.dgl.ai/tutorials/basics/4_batch.html

import torch.nn as nn

class NodeApplyModule(nn.Module):
    """Update the node feature hv with ReLU(Whv+b)."""
    def __init__(self, in_feats, out_feats, activation):
        super(NodeApplyModule, self).__init__()
        self.linear = nn.Linear(in_feats, out_feats)
        self.activation = activation

    def forward(self, node):
        h = self.linear(node.data['h'])
        h = self.activation(h)
        return {'h': h}

class GCN(nn.Module):
    def __init__(self, in_feats, out_feats, activation):
        super(GCN, self).__init__()
        self.apply_mod = NodeApplyModule(in_feats, out_feats, activation)

    def forward(self, g, feature):
        # Initialize the node features with h.
        g.ndata['h'] = feature
        # msg and reduce are the message/reduce functions defined in the
        # tutorial (copy the source 'h' into the mailbox, then average).
        g.update_all(msg, reduce)
        g.apply_nodes(func=self.apply_mod)
        return g.ndata.pop('h')

The only difference is the initial input:
in the batched GCN example it is h = g.in_degrees().view(-1, 1).float(),
while mine is h = g.ndata['attr'], which already existed before the graphs were batched.

I don't understand why the example works, since it updates g.ndata['h'] from n_node x 1 to n_node x 256. In my case, apply_mod updates g.ndata['h'] from n_node x 7 to n_node x 64.

I tried to use

from dgl.frame import Scheme
nodes.data.schemes['h'] = Scheme(shape=(64,), dtype=torch.float32)

to bypass this problem, but it still doesn't work.

This is very urgent, the deadline is coming up…

thanks!

update g.ndata['h'] from n_node x 1 to n_node x 256

I don't think DGL can change the data scheme once it is created. I'm not quite sure about the case you described.

However, if you have different data shapes, simply using a new key name in ndata would work.

If I do that, the code becomes very ugly: the reduce function needs to be changed to handle the different attribute keys, and that requires some if/else statements to switch between them.

Why does the batched GCN example work?

My previous response was not accurate. The update_all function can rewrite the whole scheme associated with a key name. If you want to override and abandon the data under a previous key, you can use del g.ndata['h'] or g.ndata.pop('h') to remove the original field first, and then use g.ndata['h'] = ... to set new values.
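The del/pop workaround can be illustrated with a stdlib-only sketch (the `ToyFrame` class below is hypothetical, not DGL code): the frame records each key's column scheme on first assignment, rejects later assignments of a different width, and forgets the scheme once the key is deleted.

```python
# Toy stand-in (not DGL code) for a frame with per-key column schemes.
class ToyFrame:
    def __init__(self):
        self._data = {}
        self._schemes = {}  # key -> per-node feature width

    def __setitem__(self, key, column):
        width = len(column[0])  # feature width of each node's row
        if key in self._schemes and self._schemes[key] != width:
            raise ValueError(
                "Cannot update column of scheme ({},) using feature of "
                "scheme ({},)".format(self._schemes[key], width))
        self._schemes[key] = width
        self._data[key] = column

    def __delitem__(self, key):
        # Removing the field also clears its recorded scheme.
        del self._data[key]
        del self._schemes[key]

f = ToyFrame()
f['h'] = [[0.0]] * 4            # scheme becomes (1,)
try:
    f['h'] = [[0.0] * 10] * 4   # width 10 vs recorded 1 -> rejected
except ValueError as e:
    print(e)
del f['h']                      # drop the field and its scheme first
f['h'] = [[0.0] * 10] * 4       # now accepted
```

The key point is that deleting the field clears the scheme too, so the next assignment is treated as a fresh one.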


That seems like an acceptable solution, thank you!

Anyway, when I have time I still want to figure out why the example works…

I also get the same error with my custom dataset. However, this is most probably due to the current issue with pickle not serializing torch datatypes properly. I tried a simple experiment; here it is:

import pickle

import dgl
import torch as th
import torch.nn as nn

# feature modifier function: maps the 1-dim feature 'x' to 10 dims
def modify_feature(nodes):
    m = nn.Linear(1, 10)
    output = m(nodes.data['x'])
    return {'x': output}
# create graph
g = dgl.DGLGraph()
g.add_nodes(4)
g.ndata['x'] = th.ones(4, 1)
g.apply_nodes(func=modify_feature)
print(g.ndata['x'].shape)
# works
>>> torch.Size([4, 10])

# dump the graph (its 'x' column scheme is now (10,))
with open('test.pkl', 'wb') as file:
    pickle.dump(g, file)

# read the graph again
with open("test.pkl", 'rb') as file:
    w = pickle.load(file)
w.ndata['x'] = th.ones(4, 1)
w.apply_nodes(func=modify_feature)
# error
>>> Cannot update column of scheme Scheme(shape=(10,), dtype=torch.float32) using feature of scheme Scheme(shape=(1,), dtype=torch.float32).

For now I build and store all graphs in program context.
The tutorial works because all graphs stay in main memory.
Hope this issue gets resolved soon!
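The pickle behavior above can be mimicked without DGL (the `MiniFrame` class below is a hypothetical stand-in, not DGL code): the recorded column width travels with the pickled object, so the loaded copy rejects a column of a different width, reproducing the same shape of error.

```python
import pickle

# Toy stand-in (not DGL code) for a frame that records each key's
# feature width; the recorded width is pickled along with the data.
class MiniFrame:
    def __init__(self):
        self.data = {}
        self.schemes = {}  # key -> per-node feature width

    def set(self, key, column):
        width = len(column[0])
        if self.schemes.get(key, width) != width:
            raise ValueError(
                "Cannot update column of scheme ({},) using feature of "
                "scheme ({},)".format(self.schemes[key], width))
        self.schemes[key] = width
        self.data[key] = column

f = MiniFrame()
f.set('x', [[1.0] * 10] * 4)    # scheme is now (10,)
blob = pickle.dumps(f)          # the scheme travels with the pickle

w = pickle.loads(blob)
try:
    w.set('x', [[1.0]] * 4)     # width 1 vs pickled 10 -> rejected
except ValueError as e:
    print(e)
```

This is only an analogy for why a freshly loaded graph can still refuse a smaller feature: the old scheme came back with the pickle.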

Hi,

Could you try the latest DGL (you may need to build it from source)?

We’ve fixed one pickle problem in this PR.

Hope this could help!

I hit an error when building it and opened a new question thread for that: Build error with Cython

Updating the column works after I built the newest source code without Cython.