Update one layer's output before feeding to the next layer

I have a GraphConv model as shown below. I want to manually update the output of one layer before feeding it to the next layer. What would be the right way to do this?
I have tried directly updating x (as shown in the forward function), but I get the following error: RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

import torch
import torch.nn as nn
import torch.nn.functional as F
import dgl.nn as dglnn

class GCN(nn.Module):
    def __init__(self, device, args):
        super(GCN, self).__init__()

        self.in_feats = args.get('in_feats', None)
        self.n_hidden = args.get('n_hidden', None)
        self.n_classes = args.get('n_classes', None)
        self.n_layers = args.get('n_layers', None)
        self.layers = nn.ModuleList()

        # check if all the necessary params are not None
        if any([self.in_feats is None, self.n_hidden is None, self.n_classes is None, self.n_layers is None]):
            # raise MissingParamsException("One or more of the following parameters are missing: in_feats, n_hidden, "
            #                              "n_classes, n_layers")
            raise Exception("One or more of the following parameters are missing: in_feats, n_hidden, "
                             "n_classes, n_layers")

        # create the model: input layer, hidden layers, output layer
        self.layers.append(dglnn.GraphConv(self.in_feats, self.n_hidden, 'both', activation=F.relu))
        for _ in range(self.n_layers - 2):
            self.layers.append(dglnn.GraphConv(self.n_hidden, self.n_hidden, 'both', activation=F.relu))
        # log_softmax needs an explicit dim, so wrap it
        self.layers.append(dglnn.GraphConv(self.n_hidden, self.n_classes, 'both',
                                           activation=lambda h: F.log_softmax(h, dim=-1)))

        # initialize the model parameters
        for layer in self.layers:
            if isinstance(layer, dglnn.GraphConv):
                layer.reset_parameters()

    def reset_parameters(self):
        # GCN itself has no weight/bias attributes; reset each GraphConv layer instead
        for layer in self.layers:
            if isinstance(layer, dglnn.GraphConv):
                layer.reset_parameters()

    def forward(self, blocks, x):
        for l, (layer, block) in enumerate(zip(self.layers, blocks)):
            x = layer(block, x)
            # UPDATE THE OUTPUT OF EACH LAYER BEFORE FEEDING TO THE NEXT LAYER
        return x

Simply write x = torch.sigmoid(x) if you want to add an activation function, for example. Avoid in-place operations: make sure the modification returns a new tensor rather than writing into the existing one.
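As a minimal pure-PyTorch sketch (no DGL needed; the tensors here are made up for illustration), an out-of-place update returns a new tensor, so nothing autograd saved gets overwritten:

```python
import torch

x = torch.randn(4, requires_grad=True)
h = torch.exp(x)        # exp saves its output h for the backward pass
h = torch.sigmoid(h)    # out-of-place: binds h to a NEW tensor, the saved one is untouched
h.sum().backward()      # works -- no saved tensor was modified
print(x.grad.shape)     # torch.Size([4])
```

Rebinding the name x (or h) is fine; the problem only arises when you mutate the tensor object itself.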

The operation I am doing is updating a few embeddings in x (say, at indices 0, 1, and 2) in the following manner:

x[0] = ...  # a tensor
x[1] = ...  # a tensor
x[2] = ...  # a tensor

This operation does raise the aforementioned exception. Any clue what can be done to avoid it?
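For reference, the error can be reproduced without DGL at all (values here are made up for illustration): writing into a tensor that autograd saved for the backward pass bumps its version counter, and backward then refuses to run:

```python
import torch

x = torch.randn(4, requires_grad=True)
h = torch.exp(x)    # exp saves its output h for the backward pass
h[0] = 0.0          # in-place write bumps h's version counter
try:
    h.sum().backward()
except RuntimeError as e:
    print(e)        # "... has been modified by an inplace operation ..."
```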

a = torch.zeros_like(x)
a[3:] = x[3:]
a[0] = ...  # a tensor
a[1] = ...  # a tensor
a[2] = ...  # a tensor
x = a

Does this work?
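A small sketch suggests it does (tensor shapes and values here are made up for illustration): the writes go into a fresh buffer a, so the tensor the previous layer saved for backward is never touched, and gradients still flow through the copied rows:

```python
import torch

x = torch.randn(5, requires_grad=True)
h = torch.exp(x)          # previous layer's output; exp saves it for backward
a = torch.zeros_like(h)   # fresh buffer, not part of the graph yet
a[3:] = h[3:]             # differentiable copy of the rows we keep
a[:3] = torch.ones(3)     # overwrite rows 0..2 with new (constant) values
a.sum().backward()        # no error: h itself was never modified in place
print(x.grad)             # zero for rows 0..2, exp(x) for rows 3..4
```

Rows overwritten with constants simply receive zero gradient, while the untouched rows backpropagate normally.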

I don't think you can do it directly: during the backward pass, PyTorch needs the intermediate values it saved in order to compute gradients, so if you assign values into the embedding tensor in place, those saved values get polluted.