Hi there.
I’m migrating code that was originally written using DGL 1.x, when neighbor sampling was done with dgl.contrib.sampling.NeighborSampler.
The sampler's output was a generator that yielded objects of the NodeFlow type. With NodeFlow, it was possible to access graph data simply with layers[i].data. For example, below is my old code for message passing on the subgraph after neighbor sampling (taken from documentation that apparently no longer exists; see: What's the best practice to train node embeddings for a big graph):
def encode(self, data_flows, training=True):
    # print(data_flows)
    x = self.embeddings
    nf = next(iter(data_flows))
    nf.copy_from_parent()
    nf.layers[0].data['activation'] = x[nf.layers[0].data['feature']]
    for i, layer in enumerate(self.layers):
        h = nf.layers[i].data.pop('activation')
        h = F.dropout(h, p=self.dropout, training=training)
        nf.layers[i].data['h'] = h
        nf.block_compute(i,
                         fn.copy_src(src='h', out='m'),
                         lambda node: {'h': node.mailbox['m'].mean(dim=1)},
                         layer)
    h = nf.layers[-1].data.pop('activation')
    return h
But now, with graphbolt, obtaining the graph data and aggregating it does not seem so simple. next(iter(subgraphs)).sampled_subgraphs[0].sampled_csc returns a sparse matrix in CSC format, but I still don’t know how to proceed with aggregation, because copy_from_parent() no longer exists and I get the error: 'CSCFormatBase' object has no attribute 'layers'.
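To make concrete what I am trying to reproduce: as far as I understand, a CSC matrix stores, for each destination node (column), the list of its source nodes (rows). Below is a minimal plain-Python sketch, on toy data, of the mean aggregation that the lambda in my old code performed. The names mean_aggregate_csc, indptr, indices and src_feats are my own and only illustrative. I am assuming the sampled CSC object exposes equivalent index-pointer and index arrays; this is not graphbolt API.

```python
def mean_aggregate_csc(indptr, indices, src_feats):
    """Average the source-node features of each destination node.

    Assumed layout (standard CSC): the sources of destination j are
    indices[indptr[j]:indptr[j + 1]].
    """
    num_dst = len(indptr) - 1
    dim = len(src_feats[0])
    out = []
    for j in range(num_dst):
        neighbors = indices[indptr[j]:indptr[j + 1]]
        if not neighbors:
            # Destination with no sampled neighbors: fall back to zeros.
            out.append([0.0] * dim)
            continue
        summed = [0.0] * dim
        for s in neighbors:
            for k in range(dim):
                summed[k] += src_feats[s][k]
        out.append([v / len(neighbors) for v in summed])
    return out


# Toy example (hypothetical data): 3 source nodes, 2 destination nodes.
indptr = [0, 2, 3]   # destination 0 has 2 sources, destination 1 has 1
indices = [0, 1, 2]  # source ids, grouped by destination
src_feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

h = mean_aggregate_csc(indptr, indices, src_feats)
print(h)
```

What I am missing is how to do the equivalent of this, layer by layer, with the tensors that graphbolt actually returns.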
May I request some guidance on how to handle the output from graphbolt’s neighbor sampler for message passing on the subgraph?
Thanks in advance.