Heterogeneous graph link prediction, predict the score

Hi,
I have a dataset for predicting whether there is a link between node A and node B. My input is not just these two nodes, because they depend on 3 other nodes, so the input is 5 nodes in total, given as a matrix. Each pair in the dataset has a score, and the score determines whether there is a link or not, so the output of the GNN should be a score.
So I want to know:

  1. Is it possible for my input to be the matrix of these 5 nodes?
  2. Can link prediction on a heterogeneous graph predict a score? (My y is the score, and I want to train the model to predict the score for a link.)

Thanks

Is it possible for my input to be the matrix of these 5 nodes?

It's possible, as message passing requires the input features not only of nodes A and B but also of their k-hop neighbors.
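To make the k-hop point concrete, here is a minimal, library-free sketch of which nodes' features are needed to compute representations for A and B with k rounds of message passing. The helper `k_hop_neighborhood` and the toy edge list are hypothetical; a real pipeline would rely on the library's neighbor sampler instead.

```python
from collections import deque

def k_hop_neighborhood(edges, seeds, k):
    """Return every node within k hops of any seed, walking edges backwards
    (from destination to source), the direction messages arrive from."""
    # in_neighbors[v] lists the nodes u that send a message to v (edge u -> v).
    in_neighbors = {}
    for u, v in edges:
        in_neighbors.setdefault(v, []).append(u)

    visited = set(seeds)
    frontier = deque((s, 0) for s in seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            continue  # do not expand beyond k hops
        for nbr in in_neighbors.get(node, []):
            if nbr not in visited:
                visited.add(nbr)
                frontier.append((nbr, depth + 1))
    return visited

# Hypothetical toy graph: C and D feed into A, E feeds into B,
# G feeds into C, and F feeds into G.
edges = [("C", "A"), ("D", "A"), ("E", "B"), ("F", "G"), ("G", "C")]
needed = k_hop_neighborhood(edges, seeds=["A", "B"], k=1)
print(sorted(needed))
```

With one layer, the two target nodes plus their three direct in-neighbors are involved, which corresponds to the 5-node input described in the question; a second layer would also pull in G.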

Can link prediction on a heterogeneous graph predict a score? (My y is the score, and I want to train the model to predict the score for a link.)

I think so, as long as the graph is a heterogeneous graph.
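As a rough illustration of training against a score rather than a yes/no label, the sketch below scores each candidate edge as the dot product of its two node embeddings and compares the result with the target score using mean squared error. It is library-free, and the embeddings, edges and targets are made-up values; the choice of a dot-product scorer and an MSE loss is an assumption, not something stated in this thread.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def edge_scores(embeddings, edges):
    """Score each (src, dst) pair as the dot product of the node embeddings."""
    return [dot(embeddings[s], embeddings[d]) for s, d in edges]

def mse_loss(predicted, target):
    """Mean squared error between predicted and target edge scores."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)

# Hypothetical 2-dimensional node embeddings, standing in for GNN outputs.
embeddings = {"A": [1.0, 0.0], "B": [0.5, 0.5], "C": [0.0, 2.0]}
edges = [("A", "B"), ("A", "C")]
y = [0.5, 1.0]  # hypothetical target scores, one per edge

scores = edge_scores(embeddings, edges)
loss = mse_loss(scores, y)
print(scores, loss)
```

A threshold on the predicted score can then decide whether a link is reported.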

Thank you for your reply.
I have two heterogeneous graphs: one is the dataset that I load, and the second is used for finding hidden relations. For that I read 6.3 Training GNN for Link Prediction with Neighborhood Sampling — DGL 0.6.1 documentation. Should the second network be assigned there as hidden_features, or should I use another model for predicting? My other question: at the moment the nodes of the heterograph have no features, so how can I randomly assign features to them?
Sorry, this is my first experience with machine learning, so I may be asking simple things.
Thanks

If you do not have input node features, see here.
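For the literal question of assigning random features, one simple option (not necessarily what the linked page describes) is to draw a fixed random vector per node for each node type. The sketch below is library-free; the helper `random_node_features`, the node type names and the sizes are all hypothetical.

```python
import random

def random_node_features(num_nodes_by_type, dim, seed=0):
    """Draw one Gaussian feature vector of length `dim` per node, per node type.
    A fixed seed makes the assignment reproducible between runs."""
    rng = random.Random(seed)
    return {
        ntype: [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
        for ntype, n in num_nodes_by_type.items()
    }

# Hypothetical node counts per type.
feats = random_node_features({"user": 3, "cell": 2}, dim=4)
print(len(feats["user"]), len(feats["user"][0]))
```

In practice, a trainable embedding table per node type (for example `torch.nn.Embedding`) is often preferred over fixed random vectors, because the vectors are then learned together with the model.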


Thanks for your help, it was helpful.
For link prediction I don't need to classify my data, but 6.3 Training GNN for Link Prediction with Neighborhood Sampling — DGL 0.6.1 documentation uses num_classes. If I don't want to use it, what should I do?

In that case, your task is probably more like edge classification and you may want to check 6.2 Training GNN for Edge Classification with Neighborhood Sampling.


Thanks for the help. I have some other questions about feature dimensions, since the input and hidden features have different dimensions:
1. Is it important to set node features or edge features when the goal is link prediction?
2. Is the way I set them below right, or should I assign them another way?
3. Do input, hidden and output features here mean the input, hidden and output layer sizes?
4. When I run the code, it shows this error:
TypeError: new() received an invalid combination of arguments - got (dict, dict), but expected one of:

  • (*, torch.device device)
    didn’t match because some of the arguments have invalid types: (!dict!, !dict!)
  • (torch.Storage storage)
  • (Tensor other)
  • (tuple of ints size, *, torch.device device)
  • (object data, *, torch.device device

And the code:

import dgl
import torch
import torch.nn as nn

class StochasticTwoLayerRGCN(nn.Module):
    def __init__(self, in_feat, hidden_feat, out_feat, rel_names):
        super().__init__()
        # One GraphConv per relation for each of the two layers.
        self.conv1 = dgl.nn.HeteroGraphConv({
            rel: dgl.nn.GraphConv(in_feat, hidden_feat, norm='right')
            for rel in rel_names
        })
        self.conv2 = dgl.nn.HeteroGraphConv({
            rel: dgl.nn.GraphConv(hidden_feat, out_feat, norm='right')
            for rel in rel_names
        })

    def forward(self, blocks, x):
        x = self.conv1(blocks[0], x)
        x = self.conv2(blocks[1], x)
        return x

class ScorePredictor(nn.Module):
    def forward(self, edge_subgraph, x):
        with edge_subgraph.local_scope():
            edge_subgraph.ndata['x'] = x
            # Score every edge as the dot product of its endpoint embeddings.
            for etype in edge_subgraph.canonical_etypes:
                edge_subgraph.apply_edges(
                    dgl.function.u_dot_v('x', 'x', 'score'), etype=etype)
            return edge_subgraph.edata['score']

class Model(nn.Module):
    def __init__(self, in_features, hidden_features, out_features, etypes):
        super().__init__()
        self.rgcn = StochasticTwoLayerRGCN(
            in_features, hidden_features, out_features, etypes)
        self.pred = ScorePredictor()

    def forward(self, positive_graph, negative_graph, blocks, x):
        x = self.rgcn(blocks, x)
        pos_score = self.pred(positive_graph, x)
        neg_score = self.pred(negative_graph, x)
        return pos_score, neg_score

model = Model(
    in_features={'user_a': 763372, 'user_a_con': 6, 'user_a_cell': 800060,
                 'user_b': 763372, 'user_b_cell': 800060, 'user_b_con': 5},
    hidden_features={'user': 763372, 'target': 168, 'protein': 17173,
                     'meet': 62246, 'gene': 167, 'cell': 800060,
                     'con': 6},
    out_features=256, etypes=load_data.canonical_etypes)
model = model.cuda()
opt = torch.optim.Adam(model.parameters())

for input_nodes, positive_graph, negative_graph, blocks in dataloader:
    blocks = [b.to(torch.device('cuda')) for b in blocks]
    positive_graph = positive_graph.to(torch.device('cuda'))
    negative_graph = negative_graph.to(torch.device('cuda'))
    input_features = blocks[0].srcdata['features']
    pos_score, neg_score = model(positive_graph, negative_graph, blocks, input_features)
    loss = compute_loss(pos_score, neg_score)
    opt.zero_grad()
    loss.backward()
    opt.step()

Do input, hidden and output features here mean the input, hidden and output layer sizes?

Correct.

4. When I run the code, it shows this error:
    TypeError: new() received an invalid combination of arguments - got (dict, dict), but expected one of:
  • (*, torch.device device)
    didn’t match because some of the arguments have invalid types: (!dict!, !dict!)
  • (torch.Storage storage)
  • (Tensor other)
  • (tuple of ints size, *, torch.device device)
  • (object data, *, torch.device device

Have you identified the line of code at which the error occurred?

Now I realize what my problem was. It was this line:
self.rgcn = StochasticTwoLayerRGCN(in_features, hidden_features, etypes)
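The `(dict, dict)` in the error message suggests that dictionaries reached a constructor expecting two integer layer sizes, which is consistent with the `in_features` and `hidden_features` dictionaries in the `Model(...)` call posted earlier. The sketch below is a library-free illustration, not the poster's confirmed fix: the hypothetical helper `relation_layer_sizes` turns per-node-type feature widths into one `(in, out)` pair of integers per relation, the form a per-relation layer such as `GraphConv(in_feat, out_feat)` takes. The schema and widths are made up.

```python
def relation_layer_sizes(canonical_etypes, in_dim_by_ntype, out_dim):
    """Map each relation name to an (in_dim, out_dim) pair of plain ints.
    The input width of a relation is the feature width of its source node type."""
    sizes = {}
    for src_type, rel, _dst_type in canonical_etypes:
        pair = (in_dim_by_ntype[src_type], out_dim)
        if not all(isinstance(d, int) for d in pair):
            raise TypeError(f"layer sizes for {rel!r} must be ints, got {pair!r}")
        if sizes.get(rel, pair) != pair:
            raise ValueError(f"relation name {rel!r} is reused with different sizes")
        sizes[rel] = pair
    return sizes

# Hypothetical schema: (source type, relation name, destination type).
canonical_etypes = [("user", "meets", "user"), ("cell", "hosts", "user")]
in_dim_by_ntype = {"user": 16, "cell": 8}  # hypothetical feature widths
sizes = relation_layer_sizes(canonical_etypes, in_dim_by_ntype, out_dim=32)
print(sizes)
```

Each pair could then be unpacked into the layer built for that relation, so that no dictionary is ever passed as a layer size.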

