Hello all, after reviewing the DGL documentation I have a quick question about inference on unseen nodes.
I have a scenario where I train a model on a graph, e.g. {a -> [b, c], b -> [c], c -> []}, but at inference time I need to predict on a new node d (one that has incoming edges from b and c, say). Node d simply doesn't exist at training time (it arrives in the future), but when it does arrive I have access to its incoming edge structure, and its predecessor nodes are guaranteed to already exist in the graph at training time.
Phrased differently: once node d's node-level features become known to me, I want to generate a prediction by (explicitly or implicitly) adding node d to the graph, along with the edges from b and c to it, and then computing a y_hat for d. At that point the (explicit or implied) graph is:
{a -> [b, c], b -> [c, d], c -> [d], d -> []}
(i.e., with a new node d and new edges from b and c to d.)
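To make the setup concrete, here is how I'd build the training graph in DGL (a minimal sketch; the node ids a=0, b=1, c=2 and the feature dimension 16 are just placeholders of mine):

```python
import torch
import dgl

# Training graph: a -> [b, c], b -> [c], c -> []
# with node ids a=0, b=1, c=2.
src = torch.tensor([0, 0, 1])
dst = torch.tensor([1, 2, 2])
g = dgl.graph((src, dst), num_nodes=3)
g.ndata['feat'] = torch.randn(3, 16)  # placeholder node features
```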
What I definitely cannot afford to do is retrain the model from scratch every time a new node (for which I need a prediction) arrives.
Is there a cookbook or example anywhere showing how this can be accomplished, or a recommendation on the most idiomatic way to approach it? A small code snippet sketching the outline would also be valuable.
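For reference, here is roughly the shape of what I imagine the inference step looking like, continuing from the training graph `g` above. This is only my sketch, assuming an inductive model like GraphSAGE (whose weights don't depend on the number of nodes it was trained on); the `SAGE` class, `d_feat`, and the layer sizes are all placeholder names of mine, not anything from the DGL docs:

```python
from dgl.nn import SAGEConv

class SAGE(torch.nn.Module):
    # A two-layer GraphSAGE model; inductive, so the trained weights
    # can be applied to a graph with nodes unseen during training.
    def __init__(self, in_feats, hid_feats, out_feats):
        super().__init__()
        self.conv1 = SAGEConv(in_feats, hid_feats, 'mean')
        self.conv2 = SAGEConv(hid_feats, out_feats, 'mean')

    def forward(self, graph, x):
        h = torch.relu(self.conv1(graph, x))
        return self.conv2(graph, h)

model = SAGE(16, 32, 1)  # in reality, trained beforehand on g

# Inference time: node d (id 3) arrives with features d_feat and
# incoming edges b -> d and c -> d. Extend the graph (this does not
# touch the trained weights) and run a single forward pass.
d_feat = torch.randn(1, 16)                  # node d's features, known now
g_inf = dgl.add_nodes(g, 1)                  # add node d as id 3
g_inf = dgl.add_edges(g_inf,
                      torch.tensor([1, 2]),  # from b, c
                      torch.tensor([3, 3]))  # to d
g_inf = dgl.add_self_loop(g_inf)             # avoid zero-in-degree errors
feat = torch.cat([g.ndata['feat'], d_feat], dim=0)

model.eval()
with torch.no_grad():
    y_hat_d = model(g_inf, feat)[3]          # prediction for node d
```

Is this (building an extended graph per new node and doing a plain forward pass) the idiomatic pattern, or is there a better-supported mechanism in DGL for this?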
Thanks all!