Weighted heterogeneous graph in dgl

I want to create a weighted heterogeneous graph in DGL. My dataframe is as follows:
[image: dataframe]

The graph I want to get is as follows:
[image: target graph]
Thank you.

Following the API for creating heterogeneous graphs (dgl.heterograph, DGL 1.0.2 documentation), you can create the graph as below:

import torch
import dgl

# one entry per edge: source user ID, destination item ID, and edge weight
uids = torch.tensor([1, 1, 2, 2, 3])
iids = torch.tensor([1, 3, 1, 4, 4])
ratings = torch.tensor([4., 2., 3., 5., 4.])
g = dgl.heterograph({("u", "u2i", "i"): (uids, iids)})
g.edata["ratings"] = ratings

In fact, I tried to build a heterogeneous graph with weighted edges from a dataframe (data) that contains three columns: users, items, and ratings.
My code is as follows:

from scipy.sparse import csc_matrix
 
#Let's take the first and second column in order to gather node IDs
nodes = data.iloc[:, 0].tolist() + data.iloc[:, 1].tolist()
 
# Let's sort and remove duplicates (sorting is not mandatory anyways)
nodes = sorted(list(set(nodes)))
 
# Let's map each node (string) with a sequential numerical ID to feed the adjacency matrix
 
nodes = [(i,nodes[i]) for i in range(len(nodes))]
 
# Now that string-to-integer mapping is done, let's replace in the original dataframe (data) each string with its corresponding ID
for i in range(len(nodes)):
    data = data.replace(nodes[i][1], nodes[i][0])
 
# create a coordinate-based sparse matrix
M = csc_matrix((data.iloc[:,2], (data.iloc[:,0],data.iloc[:,1])))
 
# create a heterogeneous graph
G = dgl.heterograph({
        ('user', 'rates', 'item') : M.nonzero()})
 
# assign edges by their weights
G.edata["rates"] = click["label"]

but I get the following error:

---------------------------------------------------------------------------
 
DGLError                                  Traceback (most recent call last)
 
<ipython-input-11-638639e41664> in <cell line: 14>()
     12 """G.edges(etype='rates') = click["label"]
     13 G.edges(etype='rated-by') = click["label"]"""
---> 14 G.edata["rates"] = click["label"]
     15
     16 """G.edges['rates'].data['h']=torch.from_numpy(click["label"].to_numpy())
 
1 frames
 
/usr/local/lib/python3.9/dist-packages/dgl/heterograph.py in _set_e_repr(self, etid, edges, data)
   4131             nfeats = F.shape(val)[0]
   4132             if nfeats != num_edges:
-> 4133                 raise DGLError('Expect number of features to match number of edges.'
   4134                                ' Got %d and %d instead.' % (nfeats, num_edges))
   4135             if F.context(val) != self.device:
 
DGLError: Expect number of features to match number of edges. Got 664824 and 621596 instead.

How can I solve this problem?

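According to the error message, the number of edge features (664824) does not match the number of edges (621596). You can check both with len(click["label"]) and G.num_edges(). My guess is that M.nonzero() drops the edges whose rating is 0. Duplicate (user, item) rows would shrink the count too, since SciPy sums them into a single matrix entry.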

Thanks a lot for your help
