Hi
Thanks for this great tool! When I tried neighborhood sampling by following the tutorial Node Classification with Neighborhood Sampling, I found that the accuracy with neighborhood sampling was inferior to that with full graph processing (83% vs. 88%). Part of the code is as follows:
With neighborhood sampling:
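(For context, the sampler and dataloader are created roughly as in the tutorial. The sketch below is only illustrative; the fan-out, batch size, and variable names g and train_nid are placeholders, not my exact settings.)

    import dgl

    # rough sketch of the sampler/dataloader setup (placeholder values);
    # g (the graph), train_nid (training node IDs) are defined earlier
    sampler = dgl.dataloading.MultiLayerNeighborSampler([10, 10])
    train_dataloader = dgl.dataloading.NodeDataLoader(
        g, train_nid, sampler,
        batch_size=1024, shuffle=True, drop_last=False, num_workers=0)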
# initialize graph
cur_best = 0
dur = []
for epoch in range(args.n_epochs):
    # print(epoch)
    model.train()
    if epoch >= 3:
        t0 = time.time()
    loss = torch.Tensor([0.]).to(device)
    # forward
    for input_nodes, output_nodes, blocks in train_dataloader:
        blocks = [b.to(device) for b in blocks]
        h = blocks[0].srcdata['feat']
        h = model(blocks, h)
        logits = h
        # print(logits)
        loss = loss + loss_fcn(logits, blocks[-1].dstdata['label'])
        # print(loss)
    loss = loss / len(train_dataloader)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
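I also wondered whether it matters that I accumulate the loss over all minibatches and call optimizer.step() only once per epoch. The per-minibatch update I have in mind would look roughly like this (assuming the same train_dataloader, model, loss_fcn, optimizer, and device as above):

    # sketch: update the parameters once per minibatch instead of once per epoch
    for input_nodes, output_nodes, blocks in train_dataloader:
        blocks = [b.to(device) for b in blocks]
        h = blocks[0].srcdata['feat']
        logits = model(blocks, h)
        loss = loss_fcn(logits, blocks[-1].dstdata['label'])
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()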
Full graph processing:
# initialize graph
dur = []
for epoch in range(args.n_epochs):
    model.train()
    if epoch >= 3:
        t0 = time.time()
    # forward
    logits = model(features)
    loss = loss_fcn(logits[train_mask], labels[train_mask])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
I am curious why this occurs, and how I can improve the performance of the model with neighborhood sampling. I would truly appreciate your help. Thank you in advance!
Best,
Yongcheng