Hello DGL!
I’m working on a link prediction problem on my own graph, where nodes are firms (e.g., IBM) and edges indicate employee turnover between them, so the graph is directed by nature.
My current concern is that some nodes unseen in a training set (e.g., 2010) pop up in a test set (e.g., 2011). In other words, my link prediction for 2011 is based on the graph in 2010.
For example, there are 200 firms (nodes) in the 2010 graph, but say 20 new firms appear in the 2011 graph, so there are 220 nodes in 2011.
The problem is that the training set and the test set have different numbers of nodes (200 vs. 220), which prevents me from evaluating my model on the test set. Since this is predictive modeling, the 20 new firms have no prior “feature” vector from 2010.
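To make the mismatch concrete, here is a minimal sketch of my setup in plain Python (no DGL calls). The firm names and the 2011 test edges are made up for illustration, and numbering nodes over the union of both years is just one assumption about how the IDs could be assigned, not a fix for the problem.

```python
# Hypothetical firm names; the real ones come from the turnover records.
firms_2010 = [f"firm_{i}" for i in range(200)]
firms_2011 = firms_2010 + [f"new_firm_{i}" for i in range(20)]

# Assumption: node IDs are assigned over the union of both years,
# so a firm seen in 2010 keeps the same ID in 2011.
all_firms = dict.fromkeys(firms_2010 + firms_2011)  # de-duplicates, keeps order
node_id = {name: idx for idx, name in enumerate(all_firms)}

# IDs of firms that appear in 2011 but were never seen in 2010.
seen_2010 = set(firms_2010)
unseen_ids = [node_id[name] for name in firms_2011 if name not in seen_2010]

# Made-up directed test edges for 2011 (source firm -> destination firm).
test_edges_2011 = [
    ("firm_0", "firm_1"),
    ("firm_3", "new_firm_5"),
    ("new_firm_2", "new_firm_7"),
]

# Edges whose endpoints both existed in 2010 can be scored with 2010 features;
# the rest touch at least one firm with no prior feature vector.
scorable = [(u, v) for u, v in test_edges_2011
            if u in seen_2010 and v in seen_2010]
cold_start = [e for e in test_edges_2011 if e not in scorable]

print(len(node_id), len(unseen_ids), len(scorable), len(cold_start))
```

In this toy version only one of the three test edges can be scored from 2010 information alone. The other two involve new firms, which is exactly the case I don’t know how to handle.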
So, is there any way in DGL to deal with this concern?
Thanks!