For a large graph, the node embeddings are too big to be stored on the GPU as model parameters. Instead, I set `requires_grad` of the node embeddings to `True` in the forward procedure of examples/pytorch/sampling/gcn_ns_sc.py (shown below). However, done this way the PyTorch optimizer does not capture these tensors as trainable parameters, so the node embeddings never actually change. A feasible alternative is to use a `torch.nn.Embedding` layer and treat node IDs as indices into the embedding table, but then the `torch.nn.Embedding` table itself is too big to fit in GPU memory. So, I want to know if there is a simple way to solve this problem. Thanks for sharing.
```python
def forward(self, nf):
    nf.layers[0].data['activation'] = nf.layers[0].data['features']
    # the line I added: mark the input features as requiring gradients
    nf.layers[0].data['activation'].requires_grad = True
    for i, layer in enumerate(self.layers):
        h = nf.layers[i].data.pop('activation')
        if self.dropout:
            h = self.dropout(h)
        nf.layers[i].data['h'] = h
        nf.block_compute(i,
                         fn.copy_src(src='h', out='m'),
                         lambda node: {'h': node.mailbox['m'].mean(dim=1)},
                         layer)
    h = nf.layers[-1].data.pop('activation')
    return h
```
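
To illustrate what I mean by the optimizer not capturing the parameters, here is a minimal sketch with a hypothetical toy tensor and model (not the actual gcn_ns_sc.py code): setting `requires_grad = True` inside `forward` does make autograd compute a gradient for the tensor, but the optimizer only updates the parameters it was given at construction, and this tensor was never registered as an `nn.Parameter`, so `optimizer.step()` leaves it unchanged.

```python
import torch
import torch.nn as nn

# Hypothetical toy setup: `features` plays the role of the node embeddings,
# `model` the role of the GCN layers. Neither comes from the DGL example.
features = torch.randn(5, 4)
model = nn.Linear(4, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

features.requires_grad = True          # same idea as the line marked above
loss = model(features).sum()
loss.backward()
optimizer.step()

print(features.grad is not None)                        # True: a gradient is computed
print(any(p is features for p in model.parameters()))   # False: the optimizer never
                                                         # saw `features`, so it is
                                                         # never updated
```

Passing the full embedding tensor to the optimizer explicitly would make it trainable, but for a large graph that brings back the memory problem described above.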