Understanding heterographs and their use with SAGEConv and GraphConv

The HeteroGraphConv documentation shows that you can add both GraphConv and SAGEConv modules to a HeteroGraphConv (HeteroGraphConv — DGL 0.8.1 documentation).

If I run GraphConv on a heterogeneous graph with different feature sizes for different node types, GraphConv doesn't complain about dimensionality, but SAGEConv does.


import dgl
import dgl.nn as dglnn
import matplotlib.pyplot as plt
import networkx as nx
import torch as th

graph_data = {
    ('drug', 'interacts', 'drug'): (th.tensor([0, 1]), th.tensor([1, 2])),
    ('drug', 'interacts', 'gene'): (th.tensor([0, 1]), th.tensor([2, 3])),
    ('drug', 'treats', 'disease'): (th.tensor([1]), th.tensor([2])),
}
g = dgl.heterograph(graph_data)

g.nodes['drug'].data['hv'] = th.ones(3, 20)
g.nodes['disease'].data['hv'] = th.ones(3, 10)
g.nodes['gene'].data['hv'] = th.ones(4, 5)

node_features = {'drug': g.nodes['drug'].data['hv'], 'disease': g.nodes['disease'].data['hv'], 'gene': g.nodes['gene'].data['hv']}

conv = dglnn.HeteroGraphConv({
    'interacts' : dglnn.GraphConv(20, 10, 'both'),
    'treats' : dglnn.GraphConv(20, 10, 'both'),
})

If I substitute SAGEConv for GraphConv, it breaks with the classic "matrices cannot be multiplied" error. What I don't understand is what is so different about GraphConv that it works with different sizes. Can anyone explain, in somewhat simpler terms, how it merges two matrices of different sizes?

Lastly, is it correctly understood that in this example the data will be aggregated over each edge type individually, and then at the end combined with the aggregation function, for example "mean"?
If so, wouldn't that create quite a mess if you have features on very different scales (age vs. height in mm vs. GPS coordinates), all squeezed together?

GraphConv does not merge two matrices of different sizes; rather, it only gathers messages from its neighbors and replaces the center node’s representation. Since the neighbors only have dimension 20 there isn’t any problem. SAGEConv on the other hand needs to combine the neighbor representation and the center node representation together, and it assumes that both representations have the same dimension, hence the error. This reminds me that SAGEConv probably needs to have two dimensions, one for the neighbors and the other for the center node itself…
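The difference between the two update rules can be sketched in plain PyTorch (illustrative names, mean aggregation assumed): a GraphConv-style update only projects the aggregated source features, so the destination node's own feature size never enters the computation; a SAGEConv-style update also projects the node's own feature, which is why it needs to know both sizes.

```python
import torch as th

def graphconv_like(neigh_feats, weight):
    # GraphConv-style: aggregate neighbor (source) features, then project.
    # The destination node's own feature is not used, so its size is irrelevant.
    return th.mean(neigh_feats, dim=0) @ weight  # (in_src,) @ (in_src, out)

def sage_like(self_feat, neigh_feats, w_self, w_neigh):
    # SAGEConv-style: combine the node's own feature with the neighbor
    # aggregate; each part needs a projection of matching input size.
    return self_feat @ w_self + th.mean(neigh_feats, dim=0) @ w_neigh

neigh = th.ones(4, 20)  # four neighbors with 20-dim features
dest = th.ones(5)       # destination node with a 5-dim feature

out1 = graphconv_like(neigh, th.randn(20, 10))
out2 = sage_like(dest, neigh, th.randn(5, 10), th.randn(20, 10))
```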

This is true in general for any kind of neural network (e.g. if you feed in salary and age as it is into an MLP). Normally one should normalize the features so that they fall in roughly the same scale.
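A minimal sketch of such normalization (the feature values are made up for illustration): standardize each column to zero mean and unit variance before feeding it into the network.

```python
import torch as th

# Hypothetical raw features on very different scales:
# column 0 = age (years), column 1 = height (mm), column 2 = longitude.
feats = th.tensor([[25., 1750.,  12.57],
                   [40., 1620.,  -0.13],
                   [33., 1810., 151.21]])

# Standardize each feature column to zero mean and unit variance.
normed = (feats - feats.mean(dim=0)) / feats.std(dim=0)
```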

Thanks. That clarified a lot on why I got the results I got.

One last question. If the features are embeddings in two different vector spaces (but of the same size), would it still be advisable to aggregate them together, or would it be better to create a separate aggregation for each type and concatenate them at the end as the embedding for each node?

The goal would be link prediction between two different nodes.

I think both options are viable and I have seen both approaches in practice. I guess it will depend on the actual data.

I just confirmed that SAGEConv does support different-sized features. You could declare SAGEConv like this:

conv = dglnn.HeteroGraphConv({
    'interacts' : dglnn.SAGEConv((20, 5), 10, 'mean'),   # 5 is the feature size of the destination ('gene') node type
    'treats' : dglnn.SAGEConv((20, 10), 10, 'mean'),     # 10 is the feature size of the destination ('disease') node type
})

Ah, I had missed that it was possible to pass the sizes as a tuple. Thanks!