MultiLayerFullNeighborSampler to mask out test nodes

I’m working on a node classification task with DGL. I’ve noticed that MultiLayerFullNeighborSampler also samples nodes outside the training set (i.e. test nodes) as source nodes. The final output nodes contain only training nodes, which is correct. This may be fine for some tasks, since the model never sees the labels of the test nodes. Is there a way to completely prevent the sampler from sampling test nodes, even when they are connected to training nodes?

Hopefully, the question is clear. Many thanks in advance.

You can build a “training graph” by taking the subgraph induced by the training nodes, and then run the sampler on that graph instead of the original one:

train_g = g.subgraph(train_nids)

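The GraphSAGE example has an inductive setting that follows this idea. Note that the subgraph relabels its nodes starting from 0, so the seed IDs you pass to the dataloader should be the subgraph’s IDs; the original IDs are kept in train_g.ndata[dgl.NID].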
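
To make the effect concrete, here is a small library-free sketch of what inducing a training graph does. It does not call DGL; the helpers induce_subgraph and full_neighbors and the toy edge list are made up for illustration, and they only mimic the behaviour described above (drop every edge that touches a non-training node, relabel the kept nodes from 0, remember the original IDs).

```python
# Library-free illustration; induce_subgraph and full_neighbors are hypothetical
# helpers written for this sketch, not DGL functions.

def induce_subgraph(edges, keep_nodes):
    """Keep edges with both endpoints in keep_nodes; relabel nodes 0..k-1."""
    orig_ids = sorted(keep_nodes)  # position = new ID, value = original ID
    new_id = {orig: new for new, orig in enumerate(orig_ids)}
    sub_edges = [(new_id[u], new_id[v]) for u, v in edges
                 if u in new_id and v in new_id]
    return sub_edges, orig_ids


def full_neighbors(edges, seeds, num_layers):
    """Collect all nodes reached by following in-edges for num_layers hops."""
    visited = set(seeds)
    frontier = set(seeds)
    for _ in range(num_layers):
        frontier = {u for u, v in edges if v in frontier}
        visited |= frontier
    return visited


# Toy directed graph: nodes 0-2 are training nodes, nodes 3-4 are test nodes.
edges = [(3, 0), (0, 1), (1, 2), (4, 2), (2, 0), (3, 4)]
train_nids = [0, 1, 2]
test_nids = {3, 4}

# Full-neighbor expansion on the whole graph pulls test nodes in as sources.
on_full = full_neighbors(edges, train_nids, num_layers=2)

# The same expansion on the induced training graph cannot reach them.
sub_edges, orig_ids = induce_subgraph(edges, train_nids)
seeds = range(len(orig_ids))
on_train = {orig_ids[n] for n in full_neighbors(sub_edges, seeds, num_layers=2)}

print(sorted(on_full))   # [0, 1, 2, 3, 4]
print(sorted(on_train))  # [0, 1, 2]
```

The trade-off is that training nodes lose every edge to a test node, so their neighborhoods at training time are smaller than they would be on the full graph.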
