For link prediction inference, how can I score every pair of nodes from new unseen graphs with no positive edges?

ogggcar · September 13, 2021, 10:13am

Thank you so much.

One problem: when instantiating the model, I now get the following error in HeteroDotProductPredictor:

TypeError: __init__() missing 1 required positional argument: 'out_feats'

at:

self.pred = HeteroDotProductPredictor()

I have tried the following:

class Model(nn.Module):
    def __init__(self, in_features, hidden_features, out_features, rel_names):
        super().__init__()
        self.sage = RGCN(in_features, hidden_features, out_features, rel_names)
        self.pred = HeteroDotProductPredictor(out_feats=out_features)
    def forward(self, g, neg_g, x, etype):
        h = self.sage(g, x)
        return self.pred(g, h, etype), self.pred(neg_g, h, etype)

But I get:

AttributeError: cannot assign module before Module.__init__() call

How should I adapt my model to this?

And please, one last thing I don’t understand yet. In the end, how can I use these feats and the new score predictor to compute the similarity of a pair of nodes for a given edge type? I guess all I need now is to extract those link-specific embeddings, but how could I do it? I understand that every node now has 3 different vectors, right? Or how does this new out_feats work?

mufeili · September 14, 2021, 4:09am

Sorry, you need to add super().__init__() in MLP and HeteroDotProductPredictor.
As you said, you now have 3 different embeddings per node corresponding to 3 edge types. You just need to take the dot product of pairs of node embeddings per edge type.
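
For reference, the AttributeError quoted earlier is what PyTorch raises when a submodule is assigned before nn.Module.__init__() has run. A minimal sketch of the failure and the fix follows. The class bodies are hypothetical, since the real MLP and HeteroDotProductPredictor definitions are not shown here, and keeping one projection per edge type in an nn.ModuleDict is an assumption:

```python
import torch
import torch.nn as nn


class BrokenPredictor(nn.Module):
    def __init__(self, out_feats):
        # super().__init__() is missing, so assigning a submodule fails.
        self.proj = nn.Linear(out_feats, out_feats)


class FixedPredictor(nn.Module):
    def __init__(self, out_feats, rel_names=("link1", "link2", "link3")):
        super().__init__()  # must run before any submodule is assigned
        # Assumed layout: one linear projection per edge type.
        self.etype_project = nn.ModuleDict(
            {rel: nn.Linear(out_feats, out_feats) for rel in rel_names}
        )


try:
    BrokenPredictor(out_feats=8)
    error_message = None
except AttributeError as err:
    error_message = str(err)

print(error_message)

fixed = FixedPredictor(out_feats=8)
print(sorted(fixed.etype_project.keys()))
```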

ogggcar · September 14, 2021, 6:57am

No need to apologize, please. Thank you so much!

I still get an error:

KeyError: ('ent', 'link1', 'ent')

in:

 graph.ndata['h'] = self.etype_project[etype](h)

Any idea why?

But how can I extract those edge-specific feats?

mufeili · September 15, 2021, 5:45am

I still get an error:

KeyError: ('ent', 'link1', 'ent')

in:

graph.ndata['h'] = self.etype_project[etype](h)

Any idea why?

You can change 'linki' to ('ent', 'linki', 'ent') in self.etype_project.
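
One caveat, in case etype_project is an nn.ModuleDict: PyTorch only accepts string keys there, so a canonical tuple cannot be used as a key directly. A possible alternative, sketched under that assumption, is to keep the relation name as the key and normalise whatever etype is passed in. The helper name rel_name is hypothetical:

```python
import torch
import torch.nn as nn


def rel_name(etype):
    """Return the relation name for 'link1' or ('ent', 'link1', 'ent')."""
    return etype[1] if isinstance(etype, tuple) else etype


out_feats = 8
# Assumed layout: one projection per relation, keyed by relation name.
etype_project = nn.ModuleDict(
    {rel: nn.Linear(out_feats, out_feats) for rel in ("link1", "link2", "link3")}
)

h = torch.randn(5, out_feats)  # stand-in for the GNN output of 5 'ent' nodes

# Both spellings of the edge type resolve to the same projection.
z_short = etype_project[rel_name("link1")](h)
z_canonical = etype_project[rel_name(("ent", "link1", "ent"))](h)
print(torch.equal(z_short, z_canonical))
```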

But how can I extract those edge-specific feats?

As I said, you have 3 MLPs, one per edge type. After you project the node representations, you have 3 types of node representations, corresponding to 3 edge types. When you take the dot product of the representation of node i and j corresponding to edge type k, you get the possibility score of having an edge between i and j of type k.
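
A minimal sketch of that scoring step for a single edge type, scoring every pair of nodes at once. The tensors and the single linear layer are stand-ins (assumptions) for h['ent'] and for the trained projection of edge type k:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
num_nodes, out_feats = 4, 8

h = torch.randn(num_nodes, out_feats)        # stand-in for h['ent'] from the GNN
project_k = nn.Linear(out_feats, out_feats)  # stand-in for the MLP of edge type k

with torch.no_grad():
    z = project_k(h)   # (num_nodes, out_feats): representations for edge type k
    scores = z @ z.T   # scores[i, j] is the dot product of z[i] and z[j]

print(scores.shape)
```

Row i of scores holds the score of node i against every other node, so no positive edges are needed in the graph to compute it. Note that the matrix is symmetric by construction.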

ogggcar · September 15, 2021, 7:22am

Hi. I got an error:

TypeError: linear(): argument 'input' (position 1) must be Tensor, not dict

but I solved it by changing

graph.ndata['h'] = self.etype_project[etype](h)

to

graph.ndata['h'] = self.etype_project[etype](h['ent'])

Now it works perfectly. I did the same with my old predictor, since my graph is a heterograph with just one node type. That makes sense, right? I got it from this post: Edge Classification with one node type - #5 by BarclayII
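
For anyone hitting the same TypeError: an encoder built on HeteroGraphConv returns a dict keyed by node type, so the node type has to be selected before the tensor is passed to a linear layer. A small sketch with stand-in values (the layer and the random features are assumptions, not the model from this thread):

```python
import torch
import torch.nn as nn

project = nn.Linear(8, 8)        # stand-in for etype_project[etype]
h = {"ent": torch.randn(5, 8)}   # same structure as the encoder output

try:
    project(h)                   # passing the whole dict fails
    raised_type_error = False
except TypeError:
    raised_type_error = True

z = project(h["ent"])            # select the node type first
print(raised_type_error, z.shape)
```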

Sorry, but I still don’t know how to get these edge-specific, updated embeddings for a given node from an unseen graph. Remember that these new graphs don’t contain any link1 or link3 edges, so there are no positive edges as there were during training, and I cannot compare them with negative ones. In conclusion, I would need to:

pass the new unseen graph, which has no link1 or link3 edges, through the model to update its embeddings;
then extract the new edge-specific embeddings.

Any code help, please? I don’t really know how to do it. If I could do something like this

updated_features = model(graph, etype)

everything would be solved, and then I could easily calculate the similarity of every pair of nodes from the updated, edge-specific embeddings.

Also, sorry for so much text, but I just realized something: can this dot product similarity handle directionality? In my case, link1 must be directional and exclusive, so if there is a link1 from u to v, there cannot be a link1 from v to u. But the dot product will be the same in both directions, right? For link1, the score of (u, v) must be different from that of (v, u). For the rest of the links it doesn’t matter. Can I solve this?

mufeili · September 16, 2021, 5:42am

Now it works perfectly. I did the same with my old predictor, since my graph is a heterograph with just one node type. That makes sense, right? I got it from this post: Edge Classification with one node type - #5 by BarclayII

That sounds fine.

Any code help, please?

The code snippet you have now will also work for this case. All you need is to update the node representations with a GNN and then score pairs of nodes for each edge type. It doesn’t matter whether you have links only for link1, link2, or link3 at test time.

Can this dot product similarity handle directionality? In my case, link1 must be directional and exclusive, so if there is a link1 from u to v, there cannot be a link1 from v to u. But the dot product will be the same in both directions, right? For link1, the score of (u, v) must be different from that of (v, u). For the rest of the links it doesn’t matter. Can I solve this?

You will then need a different score function. You cannot distinguish the edge direction by dot product. Perhaps @BarclayII can provide a suggestion on score function.
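
One common option, offered only as a sketch and not as the score function recommended in this thread, is to project source and destination nodes with different layers, so that the score of (u, v) is computed from a different pair of vectors than the score of (v, u). The class name DirectedPredictor is hypothetical:

```python
import torch
import torch.nn as nn


class DirectedPredictor(nn.Module):
    """Scores a directed pair (u, v) as <W_src h_u, W_dst h_v>."""

    def __init__(self, out_feats):
        super().__init__()
        self.src_project = nn.Linear(out_feats, out_feats)
        self.dst_project = nn.Linear(out_feats, out_feats)

    def forward(self, h):
        # scores[u, v] is the score of a directed edge u -> v.
        return self.src_project(h) @ self.dst_project(h).T


torch.manual_seed(0)
h = torch.randn(4, 8)  # stand-in for h['ent']

with torch.no_grad():
    scores = DirectedPredictor(out_feats=8)(h)

print(scores[0, 1].item(), scores[1, 0].item())
```

This only makes the two directions distinguishable. Exclusivity is not enforced by the score function itself; it would presumably have to come from training (for example, using reversed link1 edges as negatives) or from a rule applied to the scores afterwards.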

ogggcar · September 16, 2021, 6:01am

Thanks again, @mufeili

But what is the specific code snippet to update the node representations of new unseen graphs with the GNN model?

And before this, what is the code to extract the edge-specific embeddings that I need to calculate the edge-specific similarity between two nodes? Right now I only know feats_node_1 = g.ndata['feats'][1], which isn’t edge-specific.

It’s the code part of all this that I’m missing, but I understand the process.

mufeili · September 17, 2021, 3:41am

But what is the specific code snippet to update the node representations of new unseen graphs with the GNN model?

h = self.sage(g, x)

And before this, what is the code to extract the edge-specific embeddings that I need to calculate the edge-specific similarity between two nodes? Right now I only know feats_node_1 = g.ndata['feats'][1], which isn’t edge-specific.

self.etype_project[etype](h['ent'])

is edge-type specific.

ogggcar · September 18, 2021, 4:47pm

But this “self” code must be used inside a method of my model class, right? It is not the code I should use directly with a new graph, or is it? I mean, I cannot use h = self.sage(g, x) in, let’s say, a new Colab cell. Am I wrong?

mufeili · September 19, 2021, 9:50am

Assuming you have a trained model that you want to apply, you will need to save the learned model parameters so that they can be loaded later. See this PyTorch tutorial.
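
A minimal sketch of that save-and-reload step using the state_dict. A single linear layer stands in for the trained Model(...), and an in-memory buffer stands in for a file path such as "model.pt" (both are placeholders):

```python
import io

import torch
import torch.nn as nn

model = nn.Linear(8, 8)  # stand-in for the trained Model(...)

buffer = io.BytesIO()    # in practice, a file path such as "model.pt"
torch.save(model.state_dict(), buffer)

buffer.seek(0)
restored = nn.Linear(8, 8)  # re-create the same architecture first
restored.load_state_dict(torch.load(buffer))
restored.eval()             # switch to inference mode

print(torch.equal(model.weight, restored.weight))
```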

ogggcar · September 19, 2021, 3:12pm

Yes, I know this. What I mean is that, whether the model has just been trained or was loaded from a saved file, a self.something statement cannot be used outside the class definition, right?

I mean, let’s say I have a graph ‘g’ like the ones I described. If I wanted to pass it through sage, shouldn’t I call something like model.sage(g, x) instead of self.sage(g, x), which I think is only used inside the class definition? The same goes for self.etype_project.

Maybe I’m wrong; sorry in that case.

I mean, once I have trained, saved, and loaded my model, how can I apply it and the two previous snippets (or something similar) to a new graph? It’s the ‘self’ part that confuses me when I try to use it directly with a graph.

Thanks again.

mufeili · September 20, 2021, 7:13am

Yes, I know this. What I mean is that, whether the model has just been trained or was loaded from a saved file, a self.something statement cannot be used outside the class definition, right?

No, you can do

model = Model(...)
model.sage(...)

ogggcar · September 20, 2021, 7:59pm

Perfect. Thanks!!

Same with model.etype_project(…), right?

And I don’t need to add the self.sage part again or anything else in the class definition, right?

mufeili · September 21, 2021, 5:56am

For etype_project, use model.pred.etype_project.

And I don’t need to add the self.sage part again or anything else in the class definition, right?

No.

ogggcar · September 21, 2021, 6:36am

Couple of last things:

In model.sage(g, x), ‘x’ is the node feature data, right? I mean, the initial embedding vectors of every node in the graph?

Related to this: how can I access a given node in this etype_project case? Something like model.pred.etype_project[etype] and…? The h['ent'] part from before looks quite general. Could I solve it by indexing? Let’s say, how could I get the link1 embedding of the node with index 2 in a given graph G?

Thanks!!!

mufeili · September 22, 2021, 5:31am

In model.sage(g, x), ‘x’ is the node feature data, right? I mean, the initial embedding vectors of every node in the graph?

How did you define RGCN?

Related to this: how can I access a given node in this etype_project case? Something like model.pred.etype_project[etype] and…? The h['ent'] part from before looks quite general. Could I solve it by indexing? Let’s say, how could I get the link1 embedding of the node with index 2 in a given graph G?

You can index the output of etype_project[etype](h['ent']). etype_project[etype](h['ent'])[i] gives the etype embedding of the i-th node.
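
Putting the pieces of this thread together, the call pattern outside the class might look like the sketch below. To keep it runnable without DGL, the three StandIn classes are hypothetical placeholders: they only mimic the attribute layout (model.sage, model.pred.etype_project) and the dict-in, dict-out interface of the real RGCN, and the encoder ignores the graph. With the real model, G would be the new heterograph and the same three calls would apply.

```python
import torch
import torch.nn as nn


class StandInEncoder(nn.Module):
    """Placeholder for RGCN: same call signature, but ignores the graph."""

    def __init__(self, in_feats, out_feats):
        super().__init__()
        self.lin = nn.Linear(in_feats, out_feats)

    def forward(self, g, x):
        # x and the return value are dicts keyed by node type.
        return {ntype: self.lin(feat) for ntype, feat in x.items()}


class StandInPredictor(nn.Module):
    def __init__(self, out_feats, rel_names):
        super().__init__()
        self.etype_project = nn.ModuleDict(
            {rel: nn.Linear(out_feats, out_feats) for rel in rel_names}
        )


class StandInModel(nn.Module):
    def __init__(self, in_feats, out_feats, rel_names):
        super().__init__()
        self.sage = StandInEncoder(in_feats, out_feats)
        self.pred = StandInPredictor(out_feats, rel_names)


torch.manual_seed(0)
model = StandInModel(16, 8, ["link1", "link2", "link3"])
model.eval()

G = None                        # would be the new, unseen DGL heterograph
ent_feats = torch.randn(5, 16)  # would be the initial 'ent' features of G

with torch.no_grad():
    h = model.sage(G, {"ent": ent_feats})                  # updated node embeddings
    z_link1 = model.pred.etype_project["link1"](h["ent"])  # link1-specific embeddings
    node2_link1 = z_link1[2]                               # link1 embedding of node 2
    score_2_3 = torch.dot(z_link1[2], z_link1[3])          # link1 score for nodes 2 and 3

print(node2_link1.shape, score_2_3.item())
```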

ogggcar · September 22, 2021, 9:33am

RGCN:

class RGCN(nn.Module):
    def __init__(self, in_feats, hid_feats, out_feats, rel_names):
        super().__init__()

        self.conv1 = dglnn.HeteroGraphConv({
            rel: dglnn.GraphConv(in_feats, hid_feats)
            for rel in rel_names}, aggregate='sum')
        self.conv2 = dglnn.HeteroGraphConv({
            rel: dglnn.GraphConv(hid_feats, out_feats)
            for rel in rel_names}, aggregate='sum')

    def forward(self, graph, inputs):
        # inputs are features of nodes
        h = self.conv1(graph, inputs)
        h = {k: F.relu(v) for k, v in h.items()}
        h = self.conv2(graph, h)
        return h

and model:

class Model(nn.Module):
    def __init__(self, in_features, hidden_features, out_features, rel_names):
        super().__init__()
        self.sage = RGCN(in_features, hidden_features, out_features, rel_names)
        self.pred = HeteroDotProductPredictor(out_feats=out_features)
    def forward(self, g, neg_g, x, etype):
        h = self.sage(g, x)
        return self.pred(g, h, etype)

mufeili · September 23, 2021, 2:20am

See the doc on HeteroGraphConv.

ogggcar · September 23, 2021, 9:58am

Hmm, sorry, but I don’t understand. Does that mean my implementation is wrong? It follows the guidelines, although I understand we have changed some things.

In the doc you sent me, calling the conv module runs its forward function, which takes G and the features of each node type and updates them. But since I’m using my Model class rather than just my RGCN one as in that doc, and my model’s forward function takes both G and neg_G, on a new graph I’m calling just sage and not forward, right? So it should take G and the initial feats of a given node type. Since I have just one node type, I guess h = model.sage(G, ent_feats) should be OK.

Am I wrong?

mufeili · September 24, 2021, 5:22am

The update of node representations with self.sage(g, x) should be correct. However, you did not use neg_g at all in forward.