I'm having trouble with memory consumption. When I change the output dimension of my edge classification predictor from five to one, CUDA memory usage increases abnormally.

I followed the tutorial and wrote the predictor as follows:

```
from abc import ABC

import torch as th
import torch.nn as nn


class MLPPredictor(nn.Module, ABC):
    def __init__(self,
                 in_units,
                 num_classes,
                 dropout_rate=0.0):
        super(MLPPredictor, self).__init__()
        self.dropout = nn.Dropout(dropout_rate)
        self.predictor = nn.Sequential(
            nn.Linear(in_units * 2, in_units, bias=False),
            nn.Tanh(),
            nn.Linear(in_units, num_classes, bias=False),
        )
        self.reset_parameters()

    def reset_parameters(self):
        # Xavier-initialize every weight matrix.
        for p in self.parameters():
            if p.dim() > 1:
                nn.init.xavier_uniform_(p)

    def apply_edges(self, edges):
        # Concatenate source and destination node features for each edge.
        h_u = edges.src['h']
        h_v = edges.dst['h']
        # score has shape (num_edges, num_classes).
        score = self.predictor(th.cat([h_u, h_v], dim=1))
        return {'score': score}

    def forward(self, graph, ufeat, ifeat):
        graph.nodes['movie'].data['h'] = ifeat
        graph.nodes['user'].data['h'] = ufeat
        with graph.local_scope():
            graph.apply_edges(self.apply_edges)
            return graph.edata['score']
```

When `num_classes` is five, memory usage after computing the cross-entropy loss is about 1401 MB.

When I change `num_classes` to one, memory usage after computing the MSE loss increases to 11623 MB.
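The loss code is not shown here, so the following is only a sketch of one thing worth ruling out, not a confirmed diagnosis. With `num_classes` set to one, the predictor returns scores of shape `(E, 1)`. If the labels are a 1-D tensor of shape `(E,)` (an assumption), subtracting the two broadcasts to an `(E, E)` intermediate, which could explain a large jump in memory. The edge count and tensor names below are hypothetical.

```python
import torch as th
import torch.nn.functional as F

num_edges = 1000  # hypothetical edge count, for illustration only

score = th.randn(num_edges, 1)  # predictor output when num_classes == 1
label = th.randn(num_edges)     # assumed 1-D rating labels, shape (E,)

# Broadcasting (E, 1) against (E,) produces an (E, E) intermediate.
diff = score - label
print(diff.shape)  # torch.Size([1000, 1000])

# Dropping the trailing dimension keeps everything at shape (E,).
diff_ok = score.squeeze(-1) - label
print(diff_ok.shape)  # torch.Size([1000])

# MSE computed on matching shapes.
loss = F.mse_loss(score.squeeze(-1), label)
```

If the shapes printed just before the loss call differ in this way, squeezing the score (or unsqueezing the label) so both tensors match is the usual fix.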

Has anyone run into the same situation? Can you help me figure out why memory consumption increases? Or can you suggest possible solutions, or ways to track down the cause?
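As a starting point for tracking this down, here is a minimal sketch that logs PyTorch's own CUDA allocation counters at chosen points, for example before and after the forward pass and the loss. The helper names `cuda_memory_mib` and `log_memory` are hypothetical, and the tensors are placeholders standing in for the real model. Note that these counters report memory held by tensors, so they can be lower than the figure shown by `nvidia-smi`, which also includes memory cached by the allocator.

```python
import torch as th


def cuda_memory_mib():
    """Return (currently allocated, peak allocated) CUDA memory in MiB."""
    if not th.cuda.is_available():
        # No GPU present: nothing to report.
        return 0.0, 0.0
    mib = 1024 ** 2
    return (th.cuda.memory_allocated() / mib,
            th.cuda.max_memory_allocated() / mib)


def log_memory(tag):
    """Print and return the current and peak allocation for a labelled step."""
    cur, peak = cuda_memory_mib()
    print(f"{tag}: allocated={cur:.0f} MiB, peak={peak:.0f} MiB")
    return cur, peak


device = th.device("cuda" if th.cuda.is_available() else "cpu")

# Placeholder workload; replace with the forward pass and loss computation.
x = th.randn(1024, 1024, device=device)
log_memory("after allocating x")
y = x @ x
log_memory("after matmul")
```

Comparing the logged values for the five-class and one-class runs step by step should show which operation is responsible for the jump.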