# Exact Inference

Hello! While reading the exact inference docs [6.6 Exact Offline Inference on Large Graphs — DGL 0.9.1post1 documentation], I noticed that the `inference()` function returns the final layer embeddings of all the nodes in the graph (shown in the code snippet below).

```python
def inference(self, g, x, batch_size, device):
    """
    Offline inference with this module
    """
    # Compute representations layer by layer
    for l, layer in enumerate([self.conv1, self.conv2]):
        y = torch.zeros(g.number_of_nodes(),
                        self.hidden_features
                        if l != self.n_layers - 1
                        else self.out_features)
        sampler = dgl.dataloading.MultiLayerFullNeighborSampler(1)
        dataloader = dgl.dataloading.DataLoader(
            g, torch.arange(g.number_of_nodes()), sampler,
            batch_size=batch_size,
            shuffle=True,
            drop_last=False)

        # Within a layer, iterate over nodes in batches
        for input_nodes, output_nodes, blocks in dataloader:
            block = blocks[0]

            # Copy the features of necessary input nodes to GPU
            h = x[input_nodes].to(device)
            # Compute output.  Note that the computation is the same,
            # but only for a single layer.
            h_dst = h[:block.number_of_dst_nodes()]
            h = F.relu(layer(block, (h, h_dst)))
            # Copy the output back to CPU.
            y[output_nodes] = h.cpu()

        x = y

    return y
```

I am assuming that, to perform inference on the test set, we would need to do something like `return y[g.ndata['test_mask'].nonzero().squeeze()]` in the last line of the function. What I am confused about is why we need to compute the output embeddings for the whole graph rather than just for the complete K-hop neighbourhood of the test nodes.
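To illustrate what that indexing expression does, here is a plain-Python sketch (no PyTorch, with made-up values): `nonzero().squeeze()` turns the boolean test mask into a list of indices, and indexing `y` with those indices keeps only the test-node rows.

```python
# Plain-Python sketch of y[test_mask.nonzero().squeeze()].
# `y` and `test_mask` mirror the names in the snippet above; the
# actual values here are invented for illustration.
y = [[0.1, 0.9], [0.8, 0.2], [0.4, 0.6], [0.7, 0.3]]  # final-layer embeddings
test_mask = [0, 1, 0, 1]                              # 1 marks a test node

# Like test_mask.nonzero().squeeze(): indices of nonzero mask entries
test_idx = [i for i, m in enumerate(test_mask) if m]
# Like y[test_idx]: keep only the test-node rows
test_embeddings = [y[i] for i in test_idx]

print(test_idx)         # [1, 3]
print(test_embeddings)  # [[0.8, 0.2], [0.7, 0.3]]
```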

It depends on your needs. In this example, inference reads the embeddings of most nodes in the graph, so computing them layer by layer over the whole graph is faster. If you only need to run inference on a small number of nodes, neighbourhood sampling is also fine.
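The trade-off can be seen without any GNN code at all. The sketch below (plain Python, toy path graph, made up for illustration) computes the K-hop neighbourhood of a seed set with BFS: with one test node the 2-hop set is small, but with a few spread-out test nodes it already covers the whole graph, and sampled inference would recompute shared intermediate embeddings per seed, whereas layer-wise inference computes each node's embedding exactly once.

```python
# Sketch: size of the K-hop neighbourhood of a set of seed (test) nodes.
# Toy adjacency list and node IDs are invented for illustration.
from collections import deque


def k_hop_neighbourhood(adj, seeds, k):
    """Return the set of all nodes within k hops of the seed nodes (BFS)."""
    seen = set(seeds)
    frontier = set(seeds)
    for _ in range(k):
        nxt = set()
        for u in frontier:
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    nxt.add(v)
        frontier = nxt
    return seen


# Toy graph: a path 0 - 1 - 2 - 3 - 4 - 5
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}

# One test node: the 2-hop neighbourhood is a small fraction of the graph
print(sorted(k_hop_neighbourhood(adj, {0}, 2)))     # [0, 1, 2]

# Two spread-out test nodes: the union already covers the whole graph
print(sorted(k_hop_neighbourhood(adj, {0, 3}, 2)))  # [0, 1, 2, 3, 4, 5]
```

So when the test set (or its K-hop closure) touches most of the graph, layer-wise full-graph inference wins; for a handful of nodes, K-hop sampled inference is cheaper.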