Autoencoder for Heterogeneous Graphs


If I am not mistaken, a Graph Autoencoder (GAE) uses an inner product to reconstruct the adjacency matrix of a homogeneous graph. I want to build a GAE for heterogeneous graphs, but they have a separate adjacency matrix for each canonical edge type, and these matrices are generally not square. How can I reconstruct a heterogeneous graph? Is it even possible?

The easiest way to decode multiple adjacency matrices is to learn a relation-specific bilinear weight for each edge type. Basically, you change the inner product x^\top y in the original formulation to x^\top W y, where x is the latent feature vector of a source node, y is the latent feature vector of a destination node, and W is a learnable matrix specific to that edge type. With the node features stacked as rows of X and Y, the whole reconstructed adjacency matrix for that edge type is X W Y^\top.
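
To make the bilinear decoder concrete, here is a minimal sketch in plain PyTorch. The class name BilinearDecoder, the toy node and edge types, and the assumption that every node type shares the same latent size are all mine and not from DGL; treat it as one possible way to write it, not the reference implementation.

```python
import torch
import torch.nn as nn


class BilinearDecoder(nn.Module):
    """Scores every (source, destination) pair with x^T W y, one W per edge type."""

    def __init__(self, canonical_etypes, latent_dim):
        super().__init__()
        # ParameterDict keys must be strings, so join the (src, etype, dst) triple.
        self.weights = nn.ParameterDict({
            "__".join(cet): nn.Parameter(torch.randn(latent_dim, latent_dim) * 0.1)
            for cet in canonical_etypes
        })

    def forward(self, h, canonical_etype):
        src_type, _, dst_type = canonical_etype
        w = self.weights["__".join(canonical_etype)]
        # (num_src, d) @ (d, d) @ (d, num_dst) -> (num_src, num_dst) logits
        return h[src_type] @ w @ h[dst_type].t()


# Hypothetical toy setup: 4 users, 3 items, latent size 8.
cets = [("user", "buys", "item"), ("user", "follows", "user")]
h = {"user": torch.randn(4, 8), "item": torch.randn(3, 8)}
decoder = BilinearDecoder(cets, latent_dim=8)
logits = {cet: decoder(h, cet) for cet in cets}
print({cet: tuple(l.shape) for cet, l in logits.items()})
```

The outputs are raw logits, one matrix per canonical edge type with shape (number of source nodes, number of destination nodes). Apply a sigmoid to get edge probabilities, or feed the logits straight into a binary cross-entropy-with-logits loss against the corresponding adjacency matrix.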

Thank you!

I want to use HeteroGraphConv for the encoder which aggregates node features from multiple relations. Can the decoder reconstruct the adjacency matrices for each canonical edge type even though the latent features contain features from node types that are not part of the “current” edge type?

Usually this is fine if you don’t want to learn an edge-type-specific latent feature for every node.

Thanks for the answer!

HeteroGraphConv returns latent features only for destination node types, but the decoder also needs latent features for source node types. How can I get latent features for node types that appear only as sources of relations and never as destinations?

You could add reverse edge types. In this case every node type will have an output from HeteroGraphConv.

But in my case, the direction of relations matters so I don’t think I can do that.

I mean, for each directed edge type etype with source-destination tensors (src, dst), you can add a reverse edge type (say, rev-etype) with the source and destination tensors swapped, i.e. (dst, src). That way, the direction information is still preserved because the reverse edges live under a separate edge type.
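
As a rough illustration of that idea, the sketch below works on a plain dictionary keyed by canonical edge type, which is, as far as I know, the same form that dgl.heterograph takes as input. The helper name add_reverse_etypes, the "rev-" prefix, and the toy edges are made up for the example.

```python
def add_reverse_etypes(edges, prefix="rev-"):
    """Return a copy of `edges` with a swapped copy of every edge type added."""
    out = dict(edges)
    for (src_type, etype, dst_type), (src, dst) in edges.items():
        # The reverse relation goes from the old destination type to the old
        # source type and gets its own name, so the original direction is kept.
        out[(dst_type, prefix + etype, src_type)] = (dst, src)
    return out


# Hypothetical toy graph: node IDs as plain lists.
edges = {
    ("user", "buys", "item"): ([0, 1, 2], [1, 0, 2]),
    ("user", "follows", "user"): ([0, 1], [1, 2]),
}
edges_with_rev = add_reverse_etypes(edges)
for cet, (src, dst) in edges_with_rev.items():
    print(cet, src, dst)
```

After this, "user" is also a destination (of rev-buys and rev-follows), so an encoder that only emits features for destination node types would produce an output for it. The decoder can still be trained on the original edge types only.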

Thanks for the clarification! Is it sufficient if I only add a reverse edge type for edges whose source node is never a destination node?

From a modeling perspective, it should be beneficial to add a reverse edge type anyway, because you usually want to allow messages to flow in the reverse direction.

The reconstruction loss does not change at all during training. Do I have to perform some kind of normalization? How does that work for heterogeneous graphs?

Usually this is caused by an implementation bug such as a dimensionality mismatch. I would suggest that you:

  • Step through your code and check that the shape of every intermediate variable is what you intend (pudb, PyCharm, VSCode, etc. all work for this purpose).
  • Try overfitting your model on a small example (e.g. a smaller graph or a single batch) and see if the loss decreases. If it does not, there is a bug either in your code or in the math behind it.
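
Both checks can be combined in a tiny overfitting test like the sketch below. It is plain PyTorch with a made-up 4-by-3 adjacency matrix, and it uses free latent features in place of a real encoder, so it only exercises the bilinear decoder and the loss; the sizes, learning rate, and step count are arbitrary choices of mine.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical tiny bipartite example: 4 source nodes, 3 destination nodes.
adj = torch.tensor([[1., 0., 0.],
                    [0., 1., 0.],
                    [0., 0., 1.],
                    [1., 1., 0.]])

# Free latent features stand in for the encoder output here.
x = torch.randn(4, 8, requires_grad=True)             # source node type
y = torch.randn(3, 8, requires_grad=True)             # destination node type
w = (torch.randn(8, 8) * 0.1).requires_grad_()        # relation-specific weight

opt = torch.optim.Adam([x, y, w], lr=0.05)
losses = []
for step in range(200):
    logits = x @ w @ y.t()
    # Shape check: the logits must line up with the adjacency matrix.
    assert logits.shape == adj.shape, (logits.shape, adj.shape)
    loss = F.binary_cross_entropy_with_logits(logits, adj)
    opt.zero_grad()
    loss.backward()
    opt.step()
    losses.append(loss.item())

print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.4f}")
```

If the loss drops on a toy problem like this but stays flat with your real encoder plugged in, the problem is more likely in the encoder or in how its outputs are wired into the decoder than in the loss itself.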
