I want to find an implementation similar to the original unsupervised GraphSAGE, but for heterogeneous graphs where each node type may have a different initial feature vector size. The idea is to project all features into a common space and then run regular unsupervised GraphSAGE.
Is there an existing implementation for this?
PinSAGE is probably what you want: https://arxiv.org/pdf/1806.01973.pdf
However, DGL doesn’t have a mature implementation for now. @BarclayII is working on the related API (https://github.com/dmlc/dgl/pull/1249). We hope to bring this out soon.
Is there any ETA for this feature?
The core features will be done at the end of Feb. Model development will start next month.
BTW, the original GraphSAGE only works on homogeneous graphs, so @navmarri, if you could point us to published papers about using GraphSAGE on heterogeneous graphs, we could try to include that in our next release.
There is no paper per se; I came across a blog from Uber about multimodal data.
To summarize the blog: they just add an extra projection layer that maps nodes with different feature sizes into a common space. Is it possible to do that in DGL, i.e. add a projection layer for some nodes while excluding others?
It’s very doable. The idea is: when we create the graph, label user nodes from 0 to #users - 1 and item nodes from #users to #users + #items - 1. During forward propagation, we apply two FC layers acting as the two individual projection layers. The results can then be concatenated, since they now have the same feature dimension, and fed to DGLGraph as usual.
def some_forward_func(self, g, user_feats, item_feats):
    # user_feats shape: (#users, D1)
    # item_feats shape: (#items, D2)
    num_users, num_items = user_feats.shape[0], item_feats.shape[0]
    user_feats = self.user_fc(user_feats)  # project to shape (#users, D)
    item_feats = self.item_fc(item_feats)  # project to shape (#items, D)
    all_feats = th.cat([user_feats, item_feats], 0)  # (#users + #items, D)
    g.ndata['h'] = all_feats
    g.update_all(...)  # message passing on the common feature space
    # optionally, you may split the results back per node type
    user_feats, item_feats = th.split(g.ndata['h'], [num_users, num_items])
    return user_feats, item_feats
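Once the node embeddings share a common space, the unsupervised GraphSAGE objective from the original paper (positive pairs from random walks, plus negative sampling) can be sketched in plain PyTorch. This is a minimal sketch; the function name and tensor shapes are my own, not part of any DGL API:

```python
import torch
import torch.nn.functional as F

def unsupervised_graphsage_loss(z_u, z_pos, z_neg):
    # z_u, z_pos: (B, D) embeddings of a node and a co-occurring neighbor
    # z_neg: (B, Q, D) embeddings of Q negative samples per node
    pos_score = (z_u * z_pos).sum(-1)                    # (B,)
    neg_score = (z_neg @ z_u.unsqueeze(-1)).squeeze(-1)  # (B, Q)
    # encourage high similarity for positives, low for negatives
    pos_loss = -F.logsigmoid(pos_score)
    neg_loss = -F.logsigmoid(-neg_score).sum(-1)
    return (pos_loss + neg_loss).mean()

z_u = torch.randn(4, 16)
z_pos = torch.randn(4, 16)
z_neg = torch.randn(4, 5, 16)
loss = unsupervised_graphsage_loss(z_u, z_pos, z_neg)
```

The projected user/item embeddings returned by the forward function above would play the role of `z_u`, `z_pos`, and `z_neg` here.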
@minjie, would the approach also work if, instead of the FC layers, we just use some sort of zero-padding to make the features (#users + #items, D1 + D2)?
Yes, that works too.
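For concreteness, here is a minimal PyTorch sketch of that padding approach; the sizes and variable names are illustrative, not from DGL:

```python
import torch as th

num_users, num_items, D1, D2 = 4, 3, 5, 7
user_feats = th.randn(num_users, D1)
item_feats = th.randn(num_items, D2)

# Pad each node type with zeros in the other type's feature slots,
# so both end up with dimension D1 + D2.
user_padded = th.cat([user_feats, th.zeros(num_users, D2)], dim=1)
item_padded = th.cat([th.zeros(num_items, D1), item_feats], dim=1)
all_feats = th.cat([user_padded, item_padded], dim=0)  # (#users + #items, D1 + D2)
```

The first FC layer of the network then effectively learns separate projections for the two halves of the padded vector.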
Has your team finished the implementation of unsupervised GNNs for heterogeneous graphs?
We have a new HeteroGraphConv module that lets you apply GraphSAGE on heterogeneous graphs: NN Modules (PyTorch) — DGL 0.6.1 documentation
How should the loss be set if we only care about edges between two node types, not within a single type?
That shouldn’t be a problem when defining the loss. You can use either a task-specific loss like cross-entropy or a max-margin loss.
You can use the following code as a decoder and pass its output into a loss:
import torch.nn as nn
import dgl.function as fn

class HeteroScorePredictor(nn.Module):
    def forward(self, g, h):
        with g.local_scope():
            g.ndata['h'] = h
            # score each edge by the dot product of its endpoint embeddings
            for etype in g.canonical_etypes:
                g.apply_edges(fn.u_dot_v('h', 'h', 'score'), etype=etype)
            return g.edata['score']
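To focus on edges between two specific types, index the returned score dict by the canonical edge type you care about and feed only those scores into the loss. A minimal max-margin sketch (the `margin_loss` helper and the score tensors are illustrative, not from DGL):

```python
import torch
import torch.nn.functional as F

def margin_loss(pos_score, neg_score, margin=1.0):
    # hinge loss: push positive-edge scores above negatives by `margin`
    return F.relu(margin - pos_score + neg_score).mean()

# e.g. pos/neg scores for the one canonical etype you care about,
# taken from the decoder's output dict on positive and negative graphs
pos_score = torch.tensor([2.0, 0.5])
neg_score = torch.tensor([0.0, 1.0])
loss = margin_loss(pos_score, neg_score)
```

Scores for the other relations in the dict are simply not passed to the loss, so they contribute no gradient.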
That’s very doable. Thank you!