Hello, I want to create a weighted heterogeneous graph in DGL. My dataframe is as follows:
The graph I want to get is as follows:
Thank you.
Following the API for creating heterogeneous graphs (dgl.heterograph, DGL 1.0.2 documentation), you can create the graph as below:
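import dgl
import torch

# source (user) and destination (item) node IDs, one pair per edge
uids = torch.tensor([1, 1, 2, 2, 3])
iids = torch.tensor([1, 3, 1, 4, 4])
# one rating per edge, in the same order as the pairs above
ratings = torch.tensor([4., 2., 3., 5., 4.])
g = dgl.heterograph({("u", "u2i", "i"): (uids, iids)})
g.edata["ratings"] = ratings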
In fact, I tried to build a heterogeneous graph with weighted edges from a dataframe, data, which contains three columns: users, items, and ratings.
My code is as follows:
import dgl
from scipy.sparse import csc_matrix
# Let's take the first and second columns in order to gather node IDs
nodes = data.iloc[:, 0].tolist() + data.iloc[:, 1].tolist()
# Let's sort and remove duplicates (sorting is not mandatory anyway)
nodes = sorted(set(nodes))
# Let's map each node (string) to a sequential numerical ID to feed the adjacency matrix
nodes = [(i, nodes[i]) for i in range(len(nodes))]
# Now that the string-to-integer mapping is done, let's replace each string in the original dataframe (data) with its corresponding ID
for i in range(len(nodes)):
    data = data.replace(nodes[i][1], nodes[i][0])
# create a coordinate-based sparse matrix
M = csc_matrix((data.iloc[:,2], (data.iloc[:,0],data.iloc[:,1])))
# create a heterogeneous graph
G = dgl.heterograph({
    ('user', 'rates', 'item'): M.nonzero()})
# assign edges by their weights
G.edata["rates"] = click["label"]
But I get the following error:
---------------------------------------------------------------------------
DGLError Traceback (most recent call last)
<ipython-input-11-638639e41664> in <cell line: 14>()
12 """G.edges(etype='rates') = click["label"]
13 G.edges(etype='rated-by') = click["label"]"""
---> 14 G.edata["rates"] = click["label"]
15
16 """G.edges['rates'].data['h']=torch.from_numpy(click["label"].to_numpy())
1 frames
/usr/local/lib/python3.9/dist-packages/dgl/heterograph.py in _set_e_repr(self, etid, edges, data)
4131 nfeats = F.shape(val)[0]
4132 if nfeats != num_edges:
-> 4133 raise DGLError('Expect number of features to match number of edges.'
4134 ' Got %d and %d instead.' % (nfeats, num_edges))
4135 if F.context(val) != self.device:
DGLError: Expect number of features to match number of edges. Got 664824 and 621596 instead.
How can I solve this problem?
According to the error message, the number of features does not match the number of edges. You can check this by comparing G.num_edges() and len(click["label"]). I guess that when you call M.nonzero(), edges with a rating of 0 are removed.
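The effect can be reproduced on a small example. The sketch below uses made-up arrays (not the poster's data) with one rating of 0 and one repeated (user, item) pair. Note that building a csc_matrix from coordinates also sums repeated pairs into a single entry, so duplicates are a second way to lose edges besides zero ratings.

```python
import numpy as np
from scipy.sparse import csc_matrix

# Hypothetical toy data: row 2 has a rating of 0, and the pair (2, 3) appears twice.
users = np.array([0, 0, 1, 1, 2, 2])
items = np.array([0, 2, 0, 3, 3, 3])
ratings = np.array([4.0, 2.0, 0.0, 5.0, 4.0, 1.0])

# Same construction as in the question: values plus (row, col) coordinates.
M = csc_matrix((ratings, (users, items)))

# nonzero() returns only the coordinates whose stored value is not 0.
rows, cols = M.nonzero()

print(len(ratings))  # 6 rows in the original table
print(len(rows))     # 4 edges survive: the zero rating is dropped, the duplicate is merged
```

One way around this, assuming every row should become its own edge, is to skip the sparse matrix and pass the user and item ID columns to dgl.heterograph directly, so that edge i corresponds to row i and the ratings column has the matching length. If repeated pairs should be merged instead, it is probably easier to aggregate them in the dataframe first and then build the graph from the result.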
Thanks a lot for your help