Is there a way to remove -1 padding in node2vec_walk?

Hi, I am trying to implement skip-gram on node2vec-based random walks. When I tested with the Cora dataset it worked perfectly, but with my own dataset of 5M edges the generated walks are always padded with -1, which does not exist in the embedding table. Is there any way to avoid this behavior?
Thanks

Usually in NLP there's an UNK token, and you can have something similar in node2vec models. You can set padding_idx in nn.Embedding (see the PyTorch 1.9.1 Embedding documentation) to -1 for this scenario to support -1 as an index.

padding_idx is an index. I tried setting it to -1, but that sets the last vector to zeros, i.e. the last node. So I increased the number of embeddings by one and used the extra (N+1)-th row, at index N, as padding_idx.

self.embedding = nn.Embedding(self.N + 1, embedding_dim, sparse=use_sparse, padding_idx=self.N)
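For anyone who wants to check this behaviour, here is a minimal sketch on a toy table. The sizes N = 4 and dim = 3 and the variable names are made up for illustration and are not from my model. It shows that a negative padding_idx is counted from the end of the table, so -1 lands on the last real node, while reserving one extra row leaves every node vector alone.

```python
import torch
import torch.nn as nn

N, dim = 4, 3  # toy sizes, chosen only for illustration

# padding_idx=-1 is normalized to num_embeddings - 1, i.e. the row of node N-1
emb_neg = nn.Embedding(N, dim, padding_idx=-1)
neg_idx = emb_neg.padding_idx
last_row_is_zero = bool(torch.all(emb_neg.weight[N - 1] == 0))

# one extra row (index N) reserved for padding; rows 0..N-1 stay ordinary node vectors
emb_pad = nn.Embedding(N + 1, dim, padding_idx=N)
pad_row_is_zero = bool(torch.all(emb_pad.weight[N] == 0))

print(neg_idx, last_row_is_zero, pad_row_is_zero)
```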

In the training loop, I replaced -1 with N, the index of the extra padding row.

pos_start, pos_rest = pos_trace[:, 0], pos_trace[:, 1:].contiguous()  # start node and following trace
pos_start[pos_start == -1] = self.N
pos_rest[pos_rest == -1] = self.N
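A self-contained version of the same idea, as a rough sketch: the toy walks, the sizes, and the names remapped and pad_grad are assumptions for illustration only. It remaps -1 in one step with torch.where instead of in-place assignment, then checks that padded positions look up a zero vector and that the padding row receives no gradient.

```python
import torch
import torch.nn as nn

N, dim = 4, 3  # toy sizes, chosen only for illustration
embedding = nn.Embedding(N + 1, dim, padding_idx=N)

# toy walks; -1 marks positions after a walk stopped early
pos_trace = torch.tensor([[0, 1, 2, -1],
                          [3, -1, -1, -1]])

# remap -1 to the padding row N without modifying pos_trace itself
remapped = torch.where(pos_trace == -1, torch.full_like(pos_trace, N), pos_trace)
pos_start, pos_rest = remapped[:, 0], remapped[:, 1:].contiguous()

vecs = embedding(pos_rest)           # shape (2, 3, dim); padded slots are zero vectors
vecs.sum().backward()
pad_grad = embedding.weight.grad[N]  # gradient of the padding row stays zero
```

Padded positions still go through the loss as zero vectors, so it may be worth masking them out (for example with pos_rest != N) when averaging.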

Thanks for your help @VoVAllen.
