Increase link numbers in a graph

Diego0511 · October 8, 2021, 8:46am

Hello, I’ve trained a link prediction model using DGL, and I am wondering how to use this model to increase the number of links in my graph.

Rhett-Ying · October 8, 2021, 9:44am

Do you want to obtain node embeddings from the trained model and then link the node pairs whose embeddings have a high dot product?

As mentioned in 5.3 Link Prediction — DGL 0.8 documentation, node embeddings could be obtained. There are multiple ways of using the node embeddings. Examples include training downstream classifiers, or doing nearest neighbor search or maximum inner product search for relevant entity recommendation.
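The maximum inner product search mentioned above can be sketched as a brute-force top-k lookup. This is only a minimal illustration, assuming the embeddings fit in memory as a single PyTorch tensor; the function name `top_k_inner_product` and the toy embeddings are hypothetical and not part of DGL.

```python
import torch

def top_k_inner_product(embeddings: torch.Tensor, k: int):
    """For each node, return the k other nodes with the largest dot product.

    Hypothetical helper for illustration; brute force, O(N^2) memory.
    """
    scores = embeddings @ embeddings.T       # (N, N) pairwise dot products
    scores.fill_diagonal_(float("-inf"))     # never recommend a node to itself
    top_scores, top_idx = torch.topk(scores, k, dim=1)
    return top_scores, top_idx

# Toy embeddings standing in for the output of a trained model
emb = torch.tensor([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
top_scores, top_idx = top_k_inner_product(emb, k=1)
```

For graphs too large for an N x N score matrix, the same lookup would need to be done in blocks or with an approximate nearest neighbor library.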

Diego0511 · October 8, 2021, 10:54am

Yep. I’ve already trained a link prediction model, using the dot product score in the loss I minimize. But I am confused about how I can actually use this model to increase the number of links in my graph.

Rhett-Ying · October 9, 2021, 1:26am

Then did you try obtaining node embeddings as in the link I showed, node_embeddings = model.sage(graph, node_features), and then linking the nodes whose embeddings have a high dot product? Is this what you mean by increasing the number of links in the graph? If not, could you explain more about what you mean?

Diego0511 · October 9, 2021, 5:30am

So basically I have a link prediction model and want to use it to predict whether there will be links between currently unconnected nodes. But I don’t know how to apply my model to iterate over those unconnected nodes. Do I need a dataloader or something?

Rhett-Ying · October 9, 2021, 5:46am

Once you have the node embeddings, go through all node pairs directly and compute their dot products, then link a pair if the value is greater than a threshold. There are N*(N-1)/2 pairs.

Diego0511 · October 9, 2021, 5:48am

Yep, that’s my problem. I am confused about how to go through all node pairs. Could you please give me an example showing how to do this?

Rhett-Ying · October 9, 2021, 6:11am

The version below is not efficient.

    import torch

    node_embeddings = model.sage(graph, node_features)
    N = graph.num_nodes()
    u, v = [], []
    # Visit each unordered pair (i, j) with i < j exactly once
    for i in range(N):
        for j in range(i + 1, N):
            value = torch.dot(node_embeddings[i], node_embeddings[j])
            # threshold is a cutoff you choose yourself
            if value > threshold:
                u.append(i)
                v.append(j)
    graph.add_edges(u, v)
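The double loop above can also be written with matrix products over blocks of rows, which is usually much faster while computing the same pairs. This is a rough sketch, not a DGL API: `candidate_edges` and `block_size` are hypothetical names, and the embeddings are assumed to be a plain (detached) PyTorch tensor.

```python
import torch

def candidate_edges(embeddings: torch.Tensor, threshold: float, block_size: int = 1024):
    """Return (u, v) with u < v for every pair whose dot product exceeds threshold.

    Hypothetical helper; processes block_size rows at a time so that only a
    (block_size, N) score matrix is held in memory.
    """
    n = embeddings.shape[0]
    us, vs = [], []
    for start in range(0, n, block_size):
        block = embeddings[start:start + block_size]   # (B, D)
        scores = block @ embeddings.T                  # (B, N) dot products
        rows, cols = torch.nonzero(scores > threshold, as_tuple=True)
        rows = rows + start                            # back to global node IDs
        keep = rows < cols                             # each pair once, no self-loops
        us.append(rows[keep])
        vs.append(cols[keep])
    return torch.cat(us), torch.cat(vs)

# Toy embeddings standing in for the output of a trained model
emb = torch.tensor([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
u, v = candidate_edges(emb, threshold=0.5, block_size=2)
```

The resulting `u` and `v` could then be passed to graph.add_edges(u, v) as in the loop version. Like the loop, this sketch does not skip pairs that are already connected, and the total work is still on the order of N^2.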

Diego0511 · October 9, 2021, 6:38am

Cool, I got the idea. I am still wondering if there is an efficient way to do this, because my graph is pretty huge. Btw, I followed the Stochastic Training of GNNs tutorial to train my model. How do I obtain node embeddings in this case?

Rhett-Ying · October 9, 2021, 6:50am

model.gcn(graph, node_features)?

Rhett-Ying · October 9, 2021, 6:56am

As for efficiency on a large graph, how about adding candidate edges first, then calling graph.apply_edges(dgl.function.u_dot_v('x', 'x', 'score')) to compute dot products on the edges (with the embeddings stored as node feature 'x'), and then removing the edges that have a low dot product? Since N (num_nodes) is very large, maybe sampling some node pairs and checking whether a link should exist between them is an option, rather than going through all possible pairs (~N^2).
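The sampling idea can be sketched in plain PyTorch as follows: draw random candidate pairs, compute the per-pair dot product (the same quantity that u_dot_v would produce on an edge), and keep only the pairs above the threshold. The function name `score_sampled_pairs` and its parameters are hypothetical, and uniform random sampling is just one assumed strategy.

```python
import torch

def score_sampled_pairs(embeddings: torch.Tensor, num_samples: int,
                        threshold: float, seed: int = 0):
    """Score randomly sampled node pairs instead of all ~N^2 pairs.

    Hypothetical helper; the same pair may be sampled more than once.
    """
    gen = torch.Generator().manual_seed(seed)
    n = embeddings.shape[0]
    u = torch.randint(0, n, (num_samples,), generator=gen)
    v = torch.randint(0, n, (num_samples,), generator=gen)
    mask = u != v                                          # drop self-pairs
    u, v = u[mask], v[mask]
    scores = (embeddings[u] * embeddings[v]).sum(dim=1)    # per-pair dot product
    keep = scores > threshold
    return u[keep], v[keep], scores[keep]

# Random embeddings standing in for the output of a trained model
emb = torch.randn(50, 8, generator=torch.Generator().manual_seed(1))
u, v, scores = score_sampled_pairs(emb, num_samples=200, threshold=0.5)
```

Only the sampled pairs are ever evaluated, so high-scoring pairs that were not sampled are missed; the number of samples trades coverage against cost.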

Diego0511 · October 9, 2021, 8:34am

Emmm… yep, that sounds like a compromise, but I am worried about the accuracy if we do it this way.
