Why are a new optimizer and scheduler generated every iteration?
I suspect this accumulates GPU memory with each iteration, because the optimizer allocates its own per-parameter state (Adam's moment estimates) alongside the model's parameters.
In the source code:

    for iteration in range(self.iterations):
        print("\n\n\nIteration: " + str(iteration + 1))
        optimizer = optim.SparseAdam(self.skip_gram_model.parameters(), lr=self.initial_lr)
        scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, len(self.dataloader))
        batch_x, batch_y ......
Usually we train a model like this:

    optimizer = optim.SparseAdam(self.skip_gram_model.parameters(), lr=self.initial_lr)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, len(self.dataloader))
    for iteration in range(self.iterations):
        print("\n\n\nIteration: " + str(iteration + 1))
        batch_x, batch_y ......
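To make the difference concrete, here is a minimal runnable sketch of both patterns. The embedding layer, learning rate, and step counts are placeholders, not the repo's actual model; SparseAdam requires sparse gradients, which is why the stand-in is an Embedding with sparse=True. It shows that recreating the optimizer every iteration discards Adam's accumulated state:

    import torch
    import torch.nn as nn
    import torch.optim as optim

    # Hypothetical stand-in for the skip-gram model: an embedding
    # layer with sparse gradients, as SparseAdam requires.
    emb = nn.Embedding(10, 4, sparse=True)

    def one_step(opt):
        opt.zero_grad()
        emb(torch.tensor([1, 2, 3])).sum().backward()
        opt.step()

    # Pattern from the source: a fresh optimizer every iteration.
    # The per-parameter state (step count, moment estimates) is
    # rebuilt from scratch each time.
    for _ in range(3):
        fresh_opt = optim.SparseAdam(emb.parameters(), lr=0.01)
        one_step(fresh_opt)
    print(int(fresh_opt.state[emb.weight]["step"]))  # 1

    # Usual pattern: one optimizer reused across iterations, so
    # Adam's state accumulates across steps as intended.
    reused_opt = optim.SparseAdam(emb.parameters(), lr=0.01)
    for _ in range(3):
        one_step(reused_opt)
    print(int(reused_opt.state[emb.weight]["step"]))  # 3

Note that recreating the CosineAnnealingLR each iteration also restarts the learning-rate schedule from initial_lr, which may be intentional (a per-epoch warm restart), but resetting the optimizer's moment estimates along with it is a side effect worth confirming.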