Is there a way to remove -1 padding in node2vec_walk?

flursky · October 18, 2021, 10:15am

Hi, I am trying to implement skip-gram on node2vec-based random walks. When I tested with the Cora dataset it worked perfectly, but with my own dataset (5M edges) the generated walks are always padded with -1, which does not exist in the embedding table. Is there any way to avoid this behavior?
Thanks

VoVAllen · October 18, 2021, 10:27am

Usually in NLP there's an UNK token, and you can have something similar in node2vec models. For this scenario you can set padding_idx in nn.Embedding (see the PyTorch 1.9.1 documentation) to -1, to support -1 as an index.

flursky · October 18, 2021, 10:55am

padding_idx is an index. I tried setting it to -1, but that sets the last vector (i.e. the last node's embedding) to zeros. So I increased num_embeddings by one and used the extra (N+1)-th row, at index N, as padding_idx.

self.embedding = nn.Embedding(self.N + 1, embedding_dim, sparse=use_sparse, padding_idx=self.N)
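
The difference between the two settings can be checked on a toy table. This is a minimal sketch, not the original model: `N`, `dim`, `emb_last`, and `emb_extra` are hypothetical names chosen for illustration.

```python
import torch
import torch.nn as nn

N, dim = 5, 4  # hypothetical node count and embedding size

# padding_idx=-1 is normalized to N - 1, so the last real node's row is zeroed.
emb_last = nn.Embedding(N, dim, padding_idx=-1)
last_is_padding = emb_last.padding_idx == N - 1
last_row_zero = bool(torch.all(emb_last.weight[N - 1] == 0))

# With one extra row, index N becomes a dedicated padding slot
# and all N real node rows keep their random initialization.
emb_extra = nn.Embedding(N + 1, dim, padding_idx=N)
pad_row_zero = bool(torch.all(emb_extra.weight[N] == 0))
real_rows_nonzero = bool(torch.all(emb_extra.weight[:N].abs().sum(dim=1) > 0))
```

In both cases the walk tensor itself must not contain -1 when it is passed to the embedding, since a negative value is not a valid lookup index.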

In the training loop, I replaced -1 with N, the index of the extra padding row.

pos_start, pos_rest = pos_trace[:, 0], pos_trace[:, 1:].contiguous()  # start node and following trace
pos_start[pos_start == -1] = self.N
pos_rest[pos_rest == -1] = self.N
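
For reference, a self-contained version of the same remapping might look like the sketch below. It assumes a toy `pos_trace` tensor standing in for the sampler output, and uses `masked_fill` to remap out of place rather than writing into the original walks; the sizes and variable names are hypothetical.

```python
import torch
import torch.nn as nn

N, dim = 5, 4  # hypothetical node count and embedding size
embedding = nn.Embedding(N + 1, dim, padding_idx=N)

# Toy walks standing in for the sampler output; -1 marks padded steps.
pos_trace = torch.tensor([[0, 1, 2, -1],
                          [3, 4, -1, -1]])

# Remap -1 to the dedicated padding index N without mutating pos_trace.
remapped = pos_trace.masked_fill(pos_trace == -1, N)
pos_start, pos_rest = remapped[:, 0], remapped[:, 1:].contiguous()

start_vecs = embedding(pos_start)  # shape (2, dim)
rest_vecs = embedding(pos_rest)    # shape (2, 3, dim)

# Padded positions look up the all-zero padding row.
pad_mask = pos_rest == N
padded_vecs_are_zero = bool(torch.all(rest_vecs[pad_mask] == 0))

# The padding row receives no gradient, so it stays zero during training.
rest_vecs.sum().backward()
pad_grad_is_zero = bool(torch.all(embedding.weight.grad[N] == 0))
```

Because padded steps map to a zero vector, their dot-product scores are 0. Depending on the loss, it may still be worth masking those positions out explicitly.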

Thanks for your help @VoVAllen.
