Hello! While reading the exact inference docs [6.6 Exact Offline Inference on Large Graphs — DGL 0.9.1post1 documentation], I noticed that the `inference()` function returns the final-layer embeddings of all the nodes in the graph (shown in the code snippet below).

```python
def inference(self, g, x, batch_size, device):
    """
    Offline inference with this module
    """
    # Compute representations layer by layer
    for l, layer in enumerate([self.conv1, self.conv2]):
        y = torch.zeros(g.number_of_nodes(),
                        self.hidden_features
                        if l != self.n_layers - 1
                        else self.out_features)
        sampler = dgl.dataloading.MultiLayerFullNeighborSampler(1)
        dataloader = dgl.dataloading.NodeDataLoader(
            g, torch.arange(g.number_of_nodes()), sampler,
            batch_size=batch_size,
            shuffle=True,
            drop_last=False)

        # Within a layer, iterate over nodes in batches
        for input_nodes, output_nodes, blocks in dataloader:
            block = blocks[0]
            # Copy the features of necessary input nodes to GPU
            h = x[input_nodes].to(device)
            # Compute output. Note that this computation is the same
            # but only for a single layer.
            h_dst = h[:block.number_of_dst_nodes()]
            h = F.relu(layer(block, (h, h_dst)))
            # Copy output back to CPU.
            y[output_nodes] = h.cpu()
        x = y
    return y
```
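To make sure I understand the control flow, here is a minimal pure-PyTorch sketch of the same layer-by-layer pattern, with plain `nn.Linear` layers standing in for the graph convolutions and all dimensions/names chosen by me for illustration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

num_nodes, in_feats, hidden_feats, out_feats, batch_size = 10, 4, 8, 3, 4
layers = [nn.Linear(in_feats, hidden_feats), nn.Linear(hidden_feats, out_feats)]

x = torch.randn(num_nodes, in_feats)  # stand-in for the input node features

# Same structure as inference(): finish layer l for *all* nodes
# before moving on to layer l + 1.
for l, layer in enumerate(layers):
    y = torch.zeros(num_nodes,
                    hidden_feats if l != len(layers) - 1 else out_feats)
    # Within a layer, iterate over nodes in batches
    for start in range(0, num_nodes, batch_size):
        batch = torch.arange(start, min(start + batch_size, num_nodes))
        y[batch] = F.relu(layer(x[batch]))
    x = y  # layer l's full output becomes layer l + 1's input

print(y.shape)  # torch.Size([10, 3])
```

So if I read it correctly, the full `y` buffer for every node is materialized at every layer, which is what prompts my question below.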

I am assuming that, to perform inference on the test set, we would need to do something like `return y[g.ndata['test_mask'].nonzero().squeeze()]` in the last line of the function. What I am confused about is why we need to calculate the output embeddings of the whole graph: why would we not calculate them just for the complete *K*-hop neighbourhood of the test nodes?
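For clarity, this is the indexing I have in mind for the test set, shown on toy tensors (the mask values and embeddings here are made up; only `nonzero().squeeze()` matches what I wrote above):

```python
import torch

# Toy stand-in for g.ndata['test_mask']: 5 nodes, nodes 1 and 3 are test nodes.
test_mask = torch.tensor([False, True, False, True, False])

# Toy stand-in for the final-layer embeddings y returned by inference().
y = torch.arange(10, dtype=torch.float32).reshape(5, 2)

# Boolean mask -> node indices of the test nodes.
test_idx = test_mask.nonzero().squeeze()

test_embeddings = y[test_idx]
print(test_idx.tolist())         # [1, 3]
print(test_embeddings.tolist())  # [[2.0, 3.0], [6.0, 7.0]]
```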