Graph classification with heterogeneous graphs

I have a dataset of heterogeneous graphs with several node types and edge types.
Not every graph has every node and edge type.
I tried this tutorial: https://docs.dgl.ai/guide/training-graph.html#guide-training-graph-classification
and specifically tried to use the HeteroClassifier described here: https://docs.dgl.ai/guide/training-graph.html#heterogeneous-graph.

I get an exception on this line:
hg = hg + dgl.mean_nodes(g, 'h', ntype=ntype)

The exception is:
RuntimeError: The size of tensor a (50) must match the size of tensor b (64) at non-singleton dimension 0

After a little debugging I noticed that the size of the output of dgl.mean_nodes equals the number of graphs in the batch that actually contain that ntype.
Is that expected?
What do you suggest if not all of my graphs have all of the ntypes?

One possible thing to do is to use zero tensors in the readout for heterographs that lack a particular node type. For the time being, you can add a placeholder node to each graph that is missing that node type.

Thank you for your answer!

Regarding your second suggestion: if I understand correctly, I should just add a singleton node of the missing ntype, without any connections?

I didn’t understand your first suggestion; could you please elaborate? :pray:

Regarding your second suggestion: if I understand correctly, I should just add a singleton node of the missing ntype, without any connections?

Yes, and you can use tensors of 0 for the features of the singleton nodes.
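For example, assuming each graph is built with the full set of node types in its schema (e.g. via the num_nodes_dict argument of dgl.heterograph, so a missing type simply has zero nodes), a padding helper might look roughly like this; the helper name, feature name 'h', and feature dimension are made up for illustration:

```python
import torch
import dgl

def pad_missing_ntypes(g, all_ntypes, feat_name='h', feat_dim=16):
    """Hypothetical helper: add one zero-feature placeholder node for
    every node type that has no nodes in this graph, so that readout
    later produces one row per graph for every ntype."""
    for ntype in all_ntypes:
        if g.num_nodes(ntype) == 0:
            g = dgl.add_nodes(
                g, 1,
                data={feat_name: torch.zeros(1, feat_dim)},
                ntype=ntype)
    return g
```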

One possible thing to do is to use zero tensors in the readout for heterographs that lack a particular node type.

That’s a possible change to make on the DGL side, so that the output of any built-in readout function will always have one row per graph in the batch.
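In the meantime, a user-side workaround could be a readout wrapper that scatters the per-graph means into a zero tensor. This is a minimal sketch assuming, as observed above, that dgl.mean_nodes returns one row only for the graphs that actually contain the given ntype:

```python
import torch
import dgl

def mean_nodes_padded(bg, feat_name, ntype, feat_dim):
    # 'bg' is a batched heterograph; build a (batch_size, feat_dim)
    # output with zero rows for graphs that have no nodes of 'ntype'.
    out = torch.zeros(bg.batch_size, feat_dim)
    has_ntype = bg.batch_num_nodes(ntype) > 0   # mask of graphs containing this ntype
    if has_ntype.any():
        # one row per graph that contains the ntype (per the behavior above)
        out[has_ntype] = dgl.mean_nodes(bg, feat_name, ntype=ntype)
    return out
```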

Thank you very much, I’ll give it a shot!

Hi, I want to ask a simple question: does heterogeneous graph learning update nodes according to the type of edge?

RGCN doesn’t achieve the effect I expect. Is there a better heterogeneous graph learning method in DGL?

You may want to try HAN and HGT.

Hi, I have a question about how RGCN learns:
dgl.nn.HeteroGraphConv({rel: GraphConv(in_feats, out_feats) for rel in rel_names}, aggregate='sum')

Does it update the nodes according to the order in which the edge types are passed in? For example, if edge types A, B, and C are passed in, are the nodes corresponding to B and C temporarily blocked while the nodes for A are being updated?

Also, could you please provide the formula for RGCN?

Say we have incoming edges of both type A and B for a particular node. The model will first aggregate the messages for each edge type in parallel, and then combine the results for different edge types.
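As a concrete illustration of that behavior (the graph, relation names, and feature sizes below are made up), HeteroGraphConv runs one GraphConv per edge type independently and then combines the per-type results, here with aggregate='sum':

```python
import torch
import dgl
import dgl.nn as dglnn

# Toy heterograph: two edge types 'A' and 'B', both ending at node type 'user'.
g = dgl.heterograph({
    ('item', 'A', 'user'): (torch.tensor([0, 1]), torch.tensor([0, 0])),
    ('item', 'B', 'user'): (torch.tensor([1]), torch.tensor([0])),
})
feats = {'item': torch.randn(2, 4), 'user': torch.randn(1, 4)}

conv = dglnn.HeteroGraphConv(
    {rel: dglnn.GraphConv(4, 8) for rel in ['A', 'B']},
    aggregate='sum')      # per-edge-type results are combined by summation

out = conv(g, feats)      # messages for 'A' and 'B' are computed independently, then summed
print(out['user'].shape)  # torch.Size([1, 8])
```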

It’s just equation 2 in the paper.
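For reference, equation 2 of the RGCN paper (Schlichtkrull et al., Modeling Relational Data with Graph Convolutional Networks) is:

$$h_i^{(l+1)} = \sigma\left( \sum_{r \in \mathcal{R}} \sum_{j \in \mathcal{N}_i^r} \frac{1}{c_{i,r}} W_r^{(l)} h_j^{(l)} + W_0^{(l)} h_i^{(l)} \right)$$

where $\mathcal{N}_i^r$ is the set of neighbors of node $i$ under relation $r$, $c_{i,r}$ is a normalization constant (e.g. $|\mathcal{N}_i^r|$), and $W_0^{(l)}$ is the self-loop weight.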

It seems that its update mechanism is no different from that of ordinary GCN. I mistakenly thought that it updated nodes according to the relation type. Thank you very much.

I don’t think I made it clear. A, B, and C are the relation names required by RGCN [https://docs.dgl.ai/guide/training-graph.html#heterogeneous-graph]. It seems that RGCN updates the corresponding nodes according to the type of edge passed in.

class RGCN(nn.Module):
    def __init__(self, in_feats, hid_feats, out_feats, rel_names):
        super().__init__()
        self.conv1 = dgl.nn.HeteroGraphConv({
            rel: GraphConv(in_feats, out_feats)
            for rel in rel_names}, aggregate='sum')


model = HeteroClassifier( rel_names=['A', 'C', 'B']).to(device)
model = HeteroClassifier( rel_names=['A', 'B', 'C']).to(device)

Where did you see these two lines of code? I think ['A', 'C', 'B'] and ['A', 'B', 'C'] are two different relations. The first one is edge type C from node type A to node type B. The second one is edge type B from node type A to node type C. The naming might be a bit confusing though.

In order to make it clearer, I changed my description above.

The order of edge types in rel_names only decides the order of the corresponding GraphConv modules. In updating node representations, the model will first aggregate messages for each edge type in parallel, and then aggregate type-wise results in the end.

So there should be no difference between the results of ABC and ACB, because the order doesn’t matter when updating?

Strictly speaking, even with a fixed random seed the two orderings will give you different initial parameters, because the per-relation modules are constructed in a different order. Other than this, there are no differences.
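A quick way to see this (the relation names and feature sizes here are arbitrary) is to build the same HeteroGraphConv twice with the relation list in different orders, resetting the seed each time, and compare the per-relation weights:

```python
import torch
import dgl.nn as dglnn

torch.manual_seed(0)
conv_abc = dglnn.HeteroGraphConv(
    {rel: dglnn.GraphConv(4, 8) for rel in ['A', 'B', 'C']}, aggregate='sum')

torch.manual_seed(0)
conv_acb = dglnn.HeteroGraphConv(
    {rel: dglnn.GraphConv(4, 8) for rel in ['A', 'C', 'B']}, aggregate='sum')

# The GraphConv for 'B' is constructed second in one model and third in the
# other, so with the same seed its initial weight differs between the two.
print(torch.equal(conv_abc.mods['B'].weight, conv_acb.mods['B'].weight))  # False
# Modules constructed in the same position get the same random draw:
print(torch.equal(conv_abc.mods['B'].weight, conv_acb.mods['C'].weight))  # True
```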