Questions related to link prediction

Maulpy · December 27, 2020, 2:55pm

Dear all members of the DGL community,

This post is a continuation of an earlier one. My current goal is to get solid link prediction code and to obtain results for several types of loss (cross-entropy, BPR, margin, etc.).

First of all, each node's feature data is loaded from a separate tensor file on my own computer.

What I want to ask is the following. Does my code make sense overall?

2.1. If I define a simple function that works on each edge for message passing, will the ‘complexity’ of that function grow as the computation goes on?
2.2. Severed links that occur as the graph evolves (e.g. if new nodes appear in between): to start with I used only 9 nodes, and it took a long time to define them manually. Is there an automated mechanism in DGL that handles this?
2.3. How can I build (maybe a dictionary of) nodes and their types in a more compact and callable way? In the heterogeneous graph guide, they are built manually.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import pandas as pd
import dgl
import dgl.nn as dglnn

# For this case I restricted my base data to areas on the border of the layout in Area X,
# under two conditions (all on the border).
# Adjacent, but not immediate:
# 14-Nov: (Area-A1, Area-A2)
# 19-Nov: (Area-B1, Area-B2)
# 21-Nov: (Area-C1, Area-C2, Area-C3, Area-C4)
# 23-Nov: (Area-D5)

# Adjacent, and relatively immediate:
# 18-Dec: (Area-E1, Area-E2, Area-E3, Area-E4, Area-E5, Area-E6)
# 20-Dec: (Area-F1, Area-F2, Area-F3, Area-F4)

# For type 1, read the data from Excel
print("Loading xlsx...")
AreaA1 = pd.read_excel('C:/Users/Acer/DGLCONDA05/Data Type 1/Area-A1.xlsx')
AreaA2 = pd.read_excel('C:/Users/Acer/DGLCONDA05/Data Type 1/Area-A2.xlsx')
# ...

# Convert to tensors
print("Converting to Tensor...")
Area_A1 = torch.tensor(AreaA1.values, dtype=torch.float32)
Area_A2 = torch.tensor(AreaA2.values, dtype=torch.float32)
# ...

# Save to tensor files
torch.save(Area_A1, 'C:/Users/Acer/DGLCONDA05/Data Type 1/Area-A1.pt')
torch.save(Area_A2, 'C:/Users/Acer/DGLCONDA05/FDC Data Type 1/Area-A2.pt')

# Define graph type 1
graph_data_type1 = {
    ('Area_A1', '0dx-2y0', 'Area_A2'): (torch.tensor([0]), torch.tensor([1])),
    ('Area_A2', '5dx1y1', 'Area_B1'): (torch.tensor([1]), torch.tensor([2])),
    # ...
}

# Build the graph before attaching any data to it.
G = dgl.heterograph(graph_data_type1)

# Edge feature data, including temporal-spatial information.
# Each edge type needs one row per edge, hence the shape (1, 3).
G.edges['0dx-2y0'].data['linkdata'] = torch.tensor([[0, -2, 0]])
G.edges['5dx1y1'].data['linkdata'] = torch.tensor([[5, 1, 1]])
# ...

# Set node data ('gabungan' = 'combined') from the tensors loaded above.
# Each node type needs one row per node of that type.
G.nodes['Area_A1'].data['gabungan'] = Area_A1
G.nodes['Area_A2'].data['gabungan'] = Area_A2
# ...


# For message passing.
class SAGE(nn.Module):
    def __init__(self, in_feats, hid_feats, out_feats):
        super().__init__()
        self.conv1 = dglnn.SAGEConv(
            in_feats=in_feats, out_feats=hid_feats, aggregator_type='mean')
        self.conv2 = dglnn.SAGEConv(
            in_feats=hid_feats, out_feats=out_feats, aggregator_type='mean')

    def forward(self, graph, inputs):
        # inputs are features of nodes
        h = self.conv1(graph, inputs)
        h = F.relu(h)
        h = self.conv2(graph, h)
        return h


# RGCN and HeteroDotProductPredictor are not defined in this post; they are
# assumed to be the classes of the same names from the DGL user guide.
class Model(nn.Module):
    def __init__(self, in_features, hidden_features, out_features, rel_names):
        super().__init__()
        self.sage = RGCN(in_features, hidden_features, out_features, rel_names)
        self.pred = HeteroDotProductPredictor()

    def forward(self, G, neg_g, j, etype):
        h = self.sage(G, j)
        return self.pred(G, h, etype), self.pred(neg_g, h, etype)

def compute_loss(pos_score, neg_score):
    # Margin loss: max(0, 1 - pos + neg), averaged over all pairs
    n_edges = pos_score.shape[0]
    return (1 - pos_score.view(n_edges, 1) + neg_score.view(n_edges, -1)).clamp(min=0).mean()


k = 3  # I hope this is reasonable
model = Model(5, 5, 5, G.etypes)  # I don't know how to choose these sizes; is this reasonable?
# In the guide, node features are keyed by node type ('user', 'item').
# What do 'user' and 'item' stand for? Here they are replaced by my own node types.
node_features = {ntype: G.nodes[ntype].data['gabungan'] for ntype in G.ntypes}
opt = torch.optim.Adam(model.parameters())
# https://docs.dgl.ai/en/0.4.x/generated/dgl.DGLGraph.edges.html: does ":" mean "all"?
# ('source', : , 'destination') is not valid Python, so one canonical edge type is used here.
etype = ('Area_A1', '0dx-2y0', 'Area_A2')
for epoch in range(10):
    # construct_negative_graph is assumed to be defined as in the DGL user guide.
    negative_graph = construct_negative_graph(G, k, etype)
    pos_score, neg_score = model(G, negative_graph, node_features, etype)
    loss = compute_loss(pos_score, neg_score)
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(loss.item())
```
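
For reference, the margin (hinge) loss that `compute_loss` aims for can be checked by hand on a few numbers. The following is a plain-Python sketch, not DGL or PyTorch code; the function name `margin_loss` and the sample scores are made up for illustration. It assumes each positive edge is paired with `k` negative scores and uses a margin of 1.

```python
def margin_loss(pos_scores, neg_scores, margin=1.0):
    """Mean of max(0, margin - pos + neg) over every (positive, negative) pair.

    pos_scores: list of n floats, one per positive edge.
    neg_scores: list of n lists, each holding the k negative scores
                sampled for the matching positive edge.
    """
    total, count = 0.0, 0
    for pos, negs in zip(pos_scores, neg_scores):
        for neg in negs:
            total += max(0.0, margin - pos + neg)
            count += 1
    return total / count


# Hypothetical scores: 2 positive edges, k = 3 negatives each.
pos = [2.0, 0.5]
neg = [[0.0, 1.5, 3.0], [0.0, 0.5, -1.0]]
loss = margin_loss(pos, neg)
```

A pair contributes nothing to the loss once the positive score exceeds the negative score by at least the margin, so the loss pushes positive edges to score higher than sampled negatives.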

mufeili · December 28, 2020, 4:51am

2.1. If I define a simple function that works on each edge for message passing, will the ‘complexity’ of that function grow as the computation goes on?

For most sparse implementations of message passing neural networks, the computation is proportional to the number of edges.
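
As a rough illustration of that statement, here is a plain-Python sketch of one round of mean-aggregation message passing over an edge list. It is not how DGL implements it; `message_passing_round` and the tiny graph are hypothetical. The per-edge function is called exactly once per edge in each round, so the work per round depends on the number of edges and stays the same from one round to the next as long as the graph does not change.

```python
def message_passing_round(num_nodes, edges, feats, edge_fn):
    """One round of mean aggregation; returns (new_feats, number of edge_fn calls)."""
    sums = [0.0] * num_nodes
    counts = [0] * num_nodes
    calls = 0
    for src, dst in edges:
        # The edge function sees only the source feature of this edge.
        sums[dst] += edge_fn(feats[src])
        counts[dst] += 1
        calls += 1
    # Nodes without incoming edges keep their previous feature.
    new_feats = [
        sums[v] / counts[v] if counts[v] else feats[v]
        for v in range(num_nodes)
    ]
    return new_feats, calls


# Hypothetical 4-node graph with 5 directed edges.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 0)]
feats = [1.0, 2.0, 3.0, 4.0]

feats_1, calls_1 = message_passing_round(4, edges, feats, lambda x: 2 * x)
feats_2, calls_2 = message_passing_round(4, edges, feats_1, lambda x: 2 * x)
```

Both rounds make the same number of calls to the edge function, one per edge.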

2.2. Severed links that occur as the graph evolves (e.g. if new nodes appear in between): to start with I used only 9 nodes, and it took a long time to define them manually. Is there an automated mechanism in DGL that handles this?

I don’t understand. Can you share a code snippet?

2.3. How can I build (maybe a dictionary of) nodes and their types in a more compact and callable way? In the heterogeneous graph guide, they are built manually.

I don’t understand. What’s the purpose of this?
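
If the question is about avoiding typing each relation by hand, one possible reading is sketched below: group a flat list of edge records by (source type, relation, destination type) to build the dictionary programmatically. This is plain Python under assumptions: the record layout and the helper `build_graph_data` are hypothetical, and the type and relation names are borrowed from the post above. The resulting dictionary has the shape that `dgl.heterograph` takes as its first argument (the ID lists would typically be converted to tensors first).

```python
from collections import defaultdict


def build_graph_data(records):
    """Group (src_type, src_id, relation, dst_type, dst_id) records by edge type.

    Returns {(src_type, relation, dst_type): (src_ids, dst_ids)}.
    Node IDs are assumed to be numbered from 0 within each node type.
    """
    grouped = defaultdict(lambda: ([], []))
    for src_type, src_id, relation, dst_type, dst_id in records:
        src_ids, dst_ids = grouped[(src_type, relation, dst_type)]
        src_ids.append(src_id)
        dst_ids.append(dst_id)
    return dict(grouped)


# Hypothetical edge table; in practice it could come from a spreadsheet.
records = [
    ("Area_A1", 0, "0dx-2y0", "Area_A2", 0),
    ("Area_A2", 0, "5dx1y1", "Area_B1", 0),
    ("Area_A2", 0, "5dx1y1", "Area_B1", 1),
]
graph_data = build_graph_data(records)
```

Adding or removing a link then means editing one row of the table rather than rewriting the dictionary by hand.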
