Hello! I found that sampling in train_cv.py (~22 s per epoch) is much slower than in train_sampling.py (~2 s per epoch) on the Reddit dataset. In train_cv.py, generating the history blocks takes extra time. In the following test, sample_history accounts for ~20 s per epoch.
```python
def sample_blocks(self, seeds):
    seeds = th.LongTensor(seeds)
    blocks = []
    hist_blocks = []
    for fanout in self.fanouts:
        frontier = dgl.sampling.sample_neighbors(self.g, seeds, fanout)
        block = dgl.to_block(frontier, seeds)
        tic = time.time()
        hist_frontier = dgl.in_subgraph(self.g, seeds)
        hist_block = dgl.to_block(hist_frontier, seeds)
        toc = time.time()
        seeds = block.srcdata[dgl.NID]
        blocks.insert(0, block)
        hist_blocks.insert(0, hist_block)
        self.sample_history += toc - tic
    return blocks, hist_blocks
```
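To narrow down whether `dgl.in_subgraph` or the second `dgl.to_block` call dominates the ~20 s, it may help to accumulate time per labeled step rather than timing the two calls together. Below is a plain-Python sketch of such a helper (the `StepTimer` name and its usage are my own illustration, not a DGL API):

```python
import time
from collections import defaultdict
from contextlib import contextmanager

class StepTimer:
    """Accumulates wall-clock time per labeled step across an epoch."""
    def __init__(self):
        self.totals = defaultdict(float)

    @contextmanager
    def step(self, label):
        # Time the wrapped statements and add the elapsed time to `label`.
        tic = time.perf_counter()
        try:
            yield
        finally:
            self.totals[label] += time.perf_counter() - tic

    def report(self):
        return dict(self.totals)

# Hypothetical usage inside sample_blocks:
#     with timer.step("in_subgraph"):
#         hist_frontier = dgl.in_subgraph(self.g, seeds)
#     with timer.step("to_block"):
#         hist_block = dgl.to_block(hist_frontier, seeds)
```

Printing `timer.report()` at the end of an epoch would show which of the two steps actually consumes the time.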
- batch size: 6000
- fan out: 2, 2
- num_work: 0
- others: default
The fan-out is only 2, 2, and the copy in in_subgraph is lazy. Why are these steps so slow? I wonder how to generate hist_block more efficiently. Thanks!