Is there a way to use ndata in a DGLHeteroGraph with multiple node types?

otaviocx · July 19, 2020, 3:02pm

Hi guys,

I’m trying to create a GNN model for node classification in a heterogeneous graph with 2 types of nodes and 2 types of edges:

G = dgl.heterograph({
    ('company', 'has_partner', 'company'): [],
    ('company', 'has_partner', 'person'): [],
    ('company', 'is_partner_of', 'company'): [],
    ('person', 'is_partner_of', 'company'): []
})

is_partner_of is a reverse relationship for has_partner to get bidirectional relations (edges).

I have already created the graph with all the edges, using code similar to the snippet above. Now I need to add the features for each node so they can be used during training. With a single node type, I would normally use the ndata field for node features. How can I do this when the graph has more than one node type?

I want to classify only the company nodes using a semi-supervised approach (I have the labels for some companies and need to predict the labels for the others). It is a task similar to the one described in this tutorial: https://docs.dgl.ai/en/0.4.x/tutorials/basics/5_hetero.html - but, in my case, I need to set the features of the nodes with pre-defined values instead of generating random vectors.

Thanks in advance!

mufeili · July 19, 2020, 3:35pm

See if the example below helps.

import dgl
import torch

G = dgl.heterograph({
    ('company', 'has_partner', 'company'): (torch.tensor([1, 2]), torch.tensor([2, 3])),
    ('company', 'has_partner', 'person'): (torch.tensor([2, 2]), torch.tensor([1, 2])),
})
# 4 nodes of type 'company'
G.nodes['company'].data['h'] = torch.randn(4, 1)
# 3 nodes of type 'person'. Nodes of different types have separate features.
G.nodes['person'].data['h'] = torch.randn(3, 2)

otaviocx · July 19, 2020, 7:23pm

Thanks @mufeili, I will try that!