Update one layer's output before feeding to the next layer

I have a GraphConv model as shown below. I want to manually update the output of one layer before feeding it to the next layer. What would be the right way to do this?
I have tried directly updating x (as shown in the forward function), but I get the following error: RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

import torch
import torch.nn as nn
import torch.nn.functional as F
import dgl.nn as dglnn


class GCN(nn.Module):
    def __init__(self, device, args):
        super(GCN, self).__init__()

        self.in_feats = args.get('in_feats', None)
        self.n_hidden = args.get('n_hidden', None)
        self.n_classes = args.get('n_classes', None)
        self.n_layers = args.get('n_layers', None)
        self.layers = nn.ModuleList()

        # check if all the necessary params are not None
        if any([self.in_feats is None, self.n_hidden is None, self.n_classes is None, self.n_layers is None]):
            # raise MissingParamsException("One or more of the following parameters are missing: in_feats, n_hidden, "
            #                              "n_classes, n_layers")
            raise Exception("One or more of the following parameters are missing: in_feats, n_hidden, "
                             "n_classes, n_layers")

        # create the model
        self.layers.append(dglnn.GraphConv(self.in_feats, self.n_hidden, 'both', activation=F.relu))
        for i in range(self.n_layers - 2):
            self.layers.append(dglnn.GraphConv(self.n_hidden, self.n_hidden, 'both', activation=F.relu))
        self.layers.append(dglnn.GraphConv(self.n_hidden, self.n_classes, 'both', activation=F.log_softmax))

        # initialize the model parameters
        for layer in self.layers:
            if isinstance(layer, dglnn.GraphConv):
                layer.reset_parameters()

    def reset_parameters(self):
        if self.weight is not None:
            nn.init.xavier_uniform_(self.weight)
        if self.bias is not None:
            nn.init.zeros_(self.bias)

    def forward(self, blocks, x):
        for l, (layer, block) in enumerate(zip(self.layers, blocks)):
            x = layer(block, x)
            # UPDATE THE OUTPUT OF EACH LAYER BEFORE FEEDING TO THE NEXT LAYER
        return x

For example, simply write x = torch.sigmoid(x) if you want to add an activation function. You should avoid in-place operations and make sure the modification returns a new tensor rather than writing into its input.
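
Applied inside your forward loop it would look like this (a minimal sketch; torch.sigmoid is just a stand-in for whatever out-of-place transformation you want to apply):

def forward(self, blocks, x):
    for l, (layer, block) in enumerate(zip(self.layers, blocks)):
        x = layer(block, x)
        # out-of-place: sigmoid returns a new tensor instead of writing into x,
        # so the tensors saved for the backward pass stay untouched
        x = torch.sigmoid(x)
    return x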

The operation I am doing is updating a few embeddings in x (say, at indices 0, 1, and 2) in the following manner:

x[0] = ...  # a tensor
x[1] = ...  # a tensor
x[2] = ...  # a tensor

This operation raises the aforementioned exception. Any clue what can be done to avoid it?
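
For reference, the error can be reproduced outside of DGL in a few lines (a minimal standalone sketch; sigmoid just stands in for any op whose backward pass needs its output):

import torch

w = torch.randn(4, 3, requires_grad=True)
y = torch.sigmoid(w)    # sigmoid saves its output for the backward pass
y[0] = torch.zeros(3)   # in-place write invalidates that saved output
y.sum().backward()      # RuntimeError: one of the variables needed for gradient
                        # computation has been modified by an inplace operation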

a = torch.zeros_like(x)
a[3:] = x[3:]
a[0] = ...  # a tensor
a[1] = ...  # a tensor
a[2] = ...  # a tensor
x = a

Does this work?
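
For what it's worth, here is a minimal standalone check of that idea with placeholder values (sigmoid stands in for a layer output; the replacement embeddings are just ones for illustration):

import torch

x = torch.randn(5, 3, requires_grad=True)
h = torch.sigmoid(x)    # stand-in for a layer output

a = torch.zeros_like(h)
a[3:] = h[3:]           # keep rows 3 and 4 from the layer output
a[0] = torch.ones(3)    # hypothetical replacement embeddings
a[1] = torch.ones(3)
a[2] = torch.ones(3)

a.sum().backward()      # no RuntimeError: h itself was never modified in-place
print(x.grad[:3])       # zeros -- the replaced rows receive no gradient
print(x.grad[3:])       # non-zero -- gradient flows through the kept rows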

I don't think you can do it, since in the backward pass PyTorch needs the gradients to update the model; if you assign a value to the embedding, the gradient will be polluted.
