Inference on large graph

navmarri · June 12, 2020, 5:46pm

I have a graph with 500k nodes. I trained a model on it in a bipartite setting and learned an embedding for each node. During inference, when a new node arrives, I want to connect it to its neighboring nodes and perform a forward pass. My question is: can I just use the new node as the seed node and compute its GraphSAGE embedding, or do I need to recompute the embeddings for all nodes in the network to be accurate?
@BarclayII

classicsong · June 13, 2020, 8:32am

You can treat the new node as the target (seed) node and run neighbor sampling from it to construct the subgraph.
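
The neighbor-sampling step described above can be sketched in plain Python. This is only an illustration, not DGL's API: the toy graph, the node names (`u_new`, `i0`, ...), and the helpers `sample_neighbors` and `sample_subgraph` are all hypothetical. It starts from the new node and samples a fixed number of neighbors per hop.

```python
import random

def sample_neighbors(adj, node, fanout, rng):
    """Return at most `fanout` neighbors of `node`, sampled without replacement."""
    neighbors = adj.get(node, [])
    if len(neighbors) <= fanout:
        return list(neighbors)
    return rng.sample(neighbors, fanout)

def sample_subgraph(adj, seed, fanouts, rng):
    """Collect the sampled multi-hop neighborhood of `seed`.

    Returns one edge list per hop, starting with the hop nearest the seed.
    Each edge is (src, dst): src sends a message to dst.
    """
    frontier = {seed}
    layers = []
    for fanout in fanouts:
        edges = []
        next_frontier = set()
        for dst in sorted(frontier):
            for src in sample_neighbors(adj, dst, fanout, rng):
                edges.append((src, dst))
                next_frontier.add(src)
        layers.append(edges)
        frontier = next_frontier
    return layers

# Hypothetical toy bipartite graph (users u*, items i*); "u_new" is the new node.
adj = {
    "u_new": ["i0", "i1"],
    "i0": ["u0", "u1", "u_new"],
    "i1": ["u1", "u2", "u3", "u_new"],
    "u0": ["i0"],
    "u1": ["i0", "i1"],
    "u2": ["i1"],
    "u3": ["i1"],
}

rng = random.Random(0)
layers = sample_subgraph(adj, "u_new", fanouts=[2, 2], rng=rng)
for hop, edges in enumerate(layers, start=1):
    print(f"hop {hop}: {edges}")
```

Only the nodes reached by these sampled edges need to go through the forward pass, so the cost depends on the fanouts and the number of layers rather than on the 500k nodes in the full graph.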

minjie · June 15, 2020, 9:54am

Either way is okay. Using only the new node as the seed is less costly but might not be as accurate as recomputing the embeddings for all nodes. A closely related topic is how to make transductive models inductive. There have been proposals on this in the context of computing embeddings for new or unseen words (https://arxiv.org/pdf/1707.06556.pdf). Using GraphSAGE to compute an approximate embedding for the new node is a reasonable approach.
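
To illustrate one possible source of the accuracy gap, here is a minimal sketch assuming a mean aggregator and that the seed-only approach samples a limited number of neighbors. The features, node names, and the `aggregate` helper are made up for the example. A sampled mean only approximates the mean over all neighbors, and the two coincide once the fanout covers the whole neighborhood.

```python
import random

def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]

def aggregate(features, neighbors, fanout=None, rng=None):
    """Mean of neighbor features.

    If `fanout` is set and smaller than the number of neighbors,
    only a random sample of `fanout` neighbors is averaged.
    """
    if fanout is not None and fanout < len(neighbors):
        neighbors = rng.sample(neighbors, fanout)
    return mean_vector([features[n] for n in neighbors])

# Hypothetical 2-dimensional features for 10 neighbors of the new node.
features = {f"i{k}": [float(k), float(k % 3)] for k in range(10)}
neighbors_of_new = [f"i{k}" for k in range(10)]

exact = aggregate(features, neighbors_of_new)
approx = aggregate(features, neighbors_of_new, fanout=3, rng=random.Random(0))
full_fanout = aggregate(features, neighbors_of_new, fanout=10, rng=random.Random(0))

print("all neighbors:   ", exact)
print("3 sampled:       ", approx)
print("fanout = degree: ", full_fanout)
```

A larger fanout reduces the sampling error at the price of a larger subgraph per new node.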