Hi!
I’ve recently used dgl.dataloading.DataLoader
in my training and I found that I can’t reproduce the results.
Every time I run next(iter(dataloader))
it samples different nodes and graphs.
I’ve already tried the method in Reproducibility, DataLoader: shuffle=True, using seeds, but it doesn’t seem to work.
I would like to know if there’s any solution to this, thanks!
Hi @Vincent,
I might know why you have this issue. For me, the previous solution still works
fix_seed(10)
# first snippet of code
sampler = dgl.dataloading.MultiLayerNeighborSampler([15, 10])
dataloader = dgl.dataloading.DataLoader(
    graph=g,                # the graph
    indices=g.nodes(),      # the node IDs to iterate over in minibatches
    graph_sampler=sampler,  # the neighbor sampler -> how we sample each seed node's neighborhood
    batch_size=256,         # size of the batch
    shuffle=True,           # whether to shuffle the node IDs at each epoch
    drop_last=False,        # whether to keep or drop the last incomplete batch
)
# second snippet of code
for batch in tqdm.tqdm(dataloader):
    input_nodes, output_nodes, blocks = batch
    print(input_nodes, output_nodes, blocks)
    break
with fix_seed defined as:
import os
import random

import numpy as np
import torch
import dgl

def fix_seed(seed):
    '''
    Fix all the random seeds so that results are reproducible.

    Args:
        seed: the seed value to use everywhere
    '''
    torch.manual_seed(seed)
    np.random.seed(seed)
    random.seed(seed)
    os.environ['PYTHONHASHSEED'] = str(seed)
    dgl.seed(seed)
    dgl.random.seed(seed)
    torch.cuda.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.benchmark = False
    # note: on GPU, torch.use_deterministic_algorithms(True) may also require
    # the CUBLAS_WORKSPACE_CONFIG environment variable to be set
    torch.use_deterministic_algorithms(True)
    # single-threaded execution avoids non-determinism from thread scheduling
    os.environ['OMP_NUM_THREADS'] = '1'
    os.environ['MKL_NUM_THREADS'] = '1'
    torch.set_num_threads(1)
If you put the first snippet of code in the same cell as the second, you only need to call fix_seed
once, but if you split the first and the second snippet into different cells, then you need to put fix_seed
in both cells.
Be careful: if you run the second snippet twice without running the first one again, you will not see the same nodes appear, because you are still iterating through your dataloader. To always get the same nodes, you need to re-initialize your dataloader.
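For example, something like this minimal sketch (just reusing g, sampler, and fix_seed from above); both cells seed before touching the dataloader, so the printed batch should be identical on every run:

# Cell A: (re-)create the dataloader
fix_seed(10)
dataloader = dgl.dataloading.DataLoader(
    g, g.nodes(), sampler, batch_size=256, shuffle=True, drop_last=False,
)

# Cell B: seed again, because it is a separate cell, then draw the first batch
fix_seed(10)
input_nodes, output_nodes, blocks = next(iter(dataloader))
print(output_nodes)  # should print the same node IDs every time you rerun this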
I hope it helps, tell me if it works.
(I am not from DGL support, I was the author of the topic you mentioned. I am just trying to help here.)
Hi @aure_bnp ,
You saved my day! Previously I only called the fix_seed()
function when creating the dataloader. I ran next(iter(dataloader))
in another cell, so no wonder it didn’t work!
Now I have a new question: do I need to put fix_seed()
in every single cell that uses the dataloader after creating it?
Just like the training code below:
print('Start training...')
for epoch in tqdm(range(EPOCH)):
    start_time = time.time()
    print('Training loss:')
    model.train()
    total_loss = 0
    fix_seed(seed)
    for _, pos_g, neg_g, blocks in train_dataloader:
Another question: if I use .py files rather than .ipynb to run my code, do I only need to call fix_seed()
in main.py
or do I need to use it in every other file?
Again I really appreciate your help, thanks a lot and I look forward to your reply!!!
Great ! Happy to help
For the first question, I think if you do :
print('Start training...')
fix_seed(seed)
for epoch in tqdm(range(EPOCH)):
    start_time = time.time()
    print('Training loss:')
    model.train()
    total_loss = 0
    for _, pos_g, neg_g, blocks in train_dataloader:
It should work, but you can check this really quickly, I’ll let you do it.
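Something like this quick check would tell you (just a sketch reusing the names from your own snippet); run the cell twice and compare the output:

# run this cell twice: both runs should print the same node IDs if seeding works
fix_seed(seed)
input_nodes, pos_g, neg_g, blocks = next(iter(train_dataloader))
print(input_nodes[:10])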
Regarding the second question, I don’t know, as I never tried with .py files. But putting it in all the .py files will definitely make it work. You should try both, and if it works with fix_seed
only in main.py, great; otherwise, put fix_seed
in every file that creates or iterates over a dataloader.
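As a starting point, here is a rough sketch of what I would try first in a script (the module names utils and data, and the helper build_dataloader, are hypothetical examples, not something from DGL or your project):

# main.py -- minimal sketch
from utils import fix_seed          # hypothetical module holding the fix_seed function
from data import build_dataloader   # hypothetical helper that builds the DGL dataloader

def main():
    fix_seed(10)                    # seed once before creating the dataloader
    dataloader = build_dataloader()
    fix_seed(10)                    # seed again right before iterating, to be safe
    for input_nodes, output_nodes, blocks in dataloader:
        print(input_nodes, output_nodes, blocks)
        break

if __name__ == '__main__':
    main()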
Thanks for the response. Besides the solution above, please also try our brand-new dataloader GraphBolt, tutorial here: 🆕 Stochastic Training of GNNs with GraphBolt — DGL 2.2.1 documentation; it is faster and more flexible to use.
Looks great! I’ll try this out. Thanks for the information!