Heterogeneous graph nodes/edges features

monk · August 24, 2020, 4:24am

About Heterogeneous graph!

According to the DGL documentation, there seems to be an assumption that nodes of the same type have no distinct features and share a single feature vector. Is that right? What should I do if the nodes of one type have different features, so that each node needs its own feature vector to be distinguished from the others?

BarclayII · August 24, 2020, 5:55am

Nodes of the same type can have different features; a feature with a given name only needs to have the same shape for every node of that type. For instance, here is how you would assign a feature named x to the node type user:

g.nodes['user'].data['x'] = torch.randn(5, 4)

Here the first dimension of the assigned tensor represents different nodes. So you essentially assign each node a different 4-dimensional vector in this case.

Which particular document are you referring to? Could you give us the link? Maybe the content there is problematic.

monk · August 24, 2020, 6:32am

https://docs.dgl.ai/guide/graph-heterogeneous.html#working-with-multiple-types

I built a heterogeneous graph by following this page.
Thanks for your example! Maybe I did something wrong.

Now I have two problems:

1. My graph construction code is:
import dgl
import torch

data_dict = {
    ('user', 'follows', 'user'): (torch.tensor([0, 0, 0, 0]), torch.tensor([0, 0, 0, 0])),
    ('user', 'follows', 'topic'): (torch.tensor([0, 0, 0, 0]), torch.tensor([1, 2, 1, 1])),
    ('user', 'plays', 'game'): (torch.tensor([0, 0, 0, 0]), torch.tensor([3, 4, 1, 1]))
}
hg = dgl.heterograph(data_dict)
After I run
hg.nodes['user'].data['x'] = torch.randn(5, 4)
the interpreter raises an error:
dgl._ffi.base.DGLError: Expect number of features to match number of nodes (len(u)). Got 5 and 1 instead.

But if I run
hg.nodes['user'].data['x'] = torch.randn(1, 4)
there is no error.
I don’t know how to fix it.

2. I would really like to see an example of building a heterogeneous graph from files on disk. I read the example of building a homogeneous graph from a file on disk (https://github.com/dglai/WWW20-Hands-on-Tutorial/blob/master/basic_tasks/1_load_data.ipynb), but I am still confused.

BarclayII · August 24, 2020, 6:48am

Re 1: Since every ID that appears for the user node type is 0, DGL can only infer that there is a single user node. That is why torch.randn(5, 4) fails but torch.randn(1, 4) succeeds. To set the number of nodes manually, pass the num_nodes_dict argument to dgl.heterograph.

Re 2: Building a heterogeneous graph from disk files usually just involves loading the connections for each edge type from disk. For example, you could load user-follows-user, user-follows-topic and user-plays-game from three separate CSV files and construct the data_dict above from them.
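A rough sketch of that loading step, using only the Python standard library. The CSV layout (a header row src,dst followed by one edge per line), the helper name read_edge_list, and the file contents are all hypothetical; in-memory strings stand in for the three files so the snippet runs on its own.

```python
import csv
import io

def read_edge_list(csv_file):
    """Read a two-column CSV with header `src,dst` into two lists of integer node IDs."""
    src, dst = [], []
    for row in csv.DictReader(csv_file):
        src.append(int(row['src']))
        dst.append(int(row['dst']))
    return src, dst

# In-memory stand-ins for three files on disk (contents made up for illustration).
# With real files, replace io.StringIO(...) with open(path, newline='').
files = {
    ('user', 'follows', 'user'): io.StringIO("src,dst\n0,1\n1,2\n"),
    ('user', 'follows', 'topic'): io.StringIO("src,dst\n0,1\n2,0\n"),
    ('user', 'plays', 'game'): io.StringIO("src,dst\n1,0\n2,3\n"),
}

# One (source IDs, destination IDs) pair per edge type.
data_dict = {etype: read_edge_list(f) for etype, f in files.items()}
```

After converting each list with torch.tensor(...), the resulting data_dict has the same form as the one passed to dgl.heterograph earlier in the thread.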

monk · August 24, 2020, 6:59am

Re 1: It works! Thanks a lot!

Re 2: I think I know how to do it now.

ysgncss · September 28, 2020, 8:57am

Must node IDs be sorted when constructing a heterogeneous graph? If they are not, the number of nodes in the graph ends up being the maximum ID.

ysgncss · September 28, 2020, 9:02am

If three nodes in the graph are numbered 0, 1, 2 and I use hg.ndata['x'] = torch.randn(3, 50) to generate features, are the features assigned to the nodes in ID order?

mufeili · September 29, 2020, 7:19am

The node IDs should be consecutive integers starting from 0. If you do not explicitly specify the number of nodes at construction, it will simply be max node ID + 1.
hg.ndata['x'][i] gives the feature 'x' of the node with ID i (with several node types, use hg.nodes[ntype].data['x'][i]).

ysgncss · September 29, 2020, 8:16am

I have some pre-labeled nodes. Should I relabel them? For example, should nodes 2, 5, 14 with edges 2-5 and 5-14 be relabeled as nodes 0, 1, 2 with edges 0-1 and 1-2?

mufeili · September 29, 2020, 8:17am

If you only have 3 nodes rather than 15 nodes (IDs 0 through 14), then yes.
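A small sketch of that relabeling in plain Python. The helper name relabel is hypothetical, and assigning new IDs in sorted order of the old labels is just one possible convention.

```python
def relabel(edges):
    """Map arbitrary node labels to consecutive IDs 0..n-1, in sorted label order."""
    labels = sorted({node for edge in edges for node in edge})
    new_id = {label: i for i, label in enumerate(labels)}
    return new_id, [(new_id[u], new_id[v]) for u, v in edges]

mapping, new_edges = relabel([(2, 5), (5, 14)])
print(mapping)    # {2: 0, 5: 1, 14: 2}
print(new_edges)  # [(0, 1), (1, 2)]
```

In a heterogeneous graph the relabeling would be done separately for each node type, since every type has its own ID space. Keeping the mapping around makes it possible to order feature rows consistently and to translate results back to the original labels.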

ysgncss · September 30, 2020, 2:07am

Thanks. I’m trying to assign labels to the edges of a heterogeneous graph. The following code doesn’t work. What should I do?

                g.edges['A'].data['x'] = torch.tensor(w_w_ids)

mufeili · September 30, 2020, 6:24pm

What’s the error message? Can you provide a code snippet for reproducing the issue?