What is the best way to propagate unique labels?

David1 · February 19, 2023, 6:55pm

Hi there,

I built a heterogeneous graph using PyTorch. The graph has three relations: (author, writes, paper), (paper, authored, author), and (paper, cites, paper). I am trying to predict the “authored” relation based on the cited papers and their references.
Each paper and each author is its own class with a unique label, so I need to propagate that information somehow. My current solution is to pass the IDs through an embedding layer and use the result as node features. Are there existing models that can learn directly from the labels of neighboring nodes?
And is converting the IDs to a lower-dimensional vector with embeddings the only proper way to do this?
(One-hot encoding did not work because I ran out of memory.)

Thanks in advance.

dyru · February 24, 2023, 11:11am

Hi @David1,

From your description, I think you’re working on a heterogeneous link prediction problem in the transductive setting. DGL provides a guide for this task here. The basic idea is to construct an embedding for each node in your graph and learn to predict links from the propagated features of the corresponding node pairs.
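To make the idea of scoring node pairs concrete, here is a minimal sketch in plain PyTorch. It is not taken from the DGL guide: the tensor sizes, the edge indices, the random tensors standing in for propagated node features, and the helper name `edge_score` are all assumptions chosen for illustration, and the dot-product scorer is just one common choice.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical sizes, purely for illustration.
num_paper, num_author, dim = 6, 4, 8

# Stand-ins for node representations after message passing.
# In a real model these would come out of the GNN layers.
h_paper = torch.randn(num_paper, dim, requires_grad=True)
h_author = torch.randn(num_author, dim, requires_grad=True)

# Observed (paper, authored, author) edges as index pairs (made up).
pos_paper = torch.tensor([0, 1, 2, 3])
pos_author = torch.tensor([0, 1, 1, 3])

# Negative edges: keep each paper and draw a random author.
# A random draw may occasionally hit a true edge; this sketch ignores that.
neg_paper = pos_paper
neg_author = torch.randint(0, num_author, pos_author.shape)


def edge_score(h_src, h_dst, src, dst):
    """Dot-product score for each (src, dst) pair."""
    return (h_src[src] * h_dst[dst]).sum(dim=-1)


pos_score = edge_score(h_paper, h_author, pos_paper, pos_author)
neg_score = edge_score(h_paper, h_author, neg_paper, neg_author)

# Binary classification: real edges get label 1, sampled edges label 0.
scores = torch.cat([pos_score, neg_score])
labels = torch.cat([torch.ones_like(pos_score), torch.zeros_like(neg_score)])
loss = F.binary_cross_entropy_with_logits(scores, labels)
loss.backward()
```

In the transductive setting the gradients from this loss flow back into the node representations, and through them into whatever learnable node embeddings feed the model.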

By labels, I guess you mean node IDs, right? Constructing learnable node embeddings is indeed the principled approach for obtaining node features when nodes have no other features attached. You also don’t need to store one-hot encodings; use the learnable node embeddings as node features directly.
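A small check of why the one-hot matrix never has to be materialized: looking up rows of an embedding table gives the same result as multiplying one-hot vectors by that table. The sizes and IDs below are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

num_nodes, dim = 5, 3  # hypothetical sizes
emb = nn.Embedding(num_nodes, dim)
ids = torch.tensor([0, 3, 4])

# Direct row lookup: memory cost is O(len(ids) * dim).
lookup = emb(ids)

# Equivalent one-hot formulation: needs a (len(ids), num_nodes) matrix first.
one_hot = F.one_hot(ids, num_classes=num_nodes).float()
via_matmul = one_hot @ emb.weight

same = torch.allclose(lookup, via_matmul)
```

Here `same` is `True`, so the embedding lookup is a memory-friendly way to compute a linear layer applied to one-hot IDs.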

David1 · February 24, 2023, 11:36am

Hey @dyru,

Thanks for your reply.

So you mean that instead of using the one-hot encoded label IDs as features, I should feed the IDs in as learnable embeddings?
I tried that approach, but somehow it performs poorly.

dyru · February 27, 2023, 4:20am

The IDs themselves are not learnable. Use learnable dense embeddings instead, like this:

import torch as th
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self, num_paper, num_author, dim):
        ...
        # dim is a hyperparameter of your choosing
        self.paper_emb = nn.Parameter(th.empty(num_paper, dim))
        self.author_emb = nn.Parameter(th.empty(num_author, dim))
        ...
    ...

Then use the learnable embeddings directly as node features. Note that th.empty leaves the values uninitialized, so remember to initialize the parameters.
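For reference, a runnable variant of the snippet above with the initialization filled in might look like the following. The class name `NodeEmbeddings`, the Xavier initialization, the sizes, and the dictionary returned by `forward` are assumptions made for this sketch, not something prescribed in the thread; other initializers may work just as well.

```python
import torch as th
import torch.nn as nn


class NodeEmbeddings(nn.Module):  # hypothetical name
    def __init__(self, num_paper, num_author, dim):
        super().__init__()
        # One learnable row per node; th.empty allocates without initializing.
        self.paper_emb = nn.Parameter(th.empty(num_paper, dim))
        self.author_emb = nn.Parameter(th.empty(num_author, dim))
        # Assumed choice: Xavier uniform initialization.
        nn.init.xavier_uniform_(self.paper_emb)
        nn.init.xavier_uniform_(self.author_emb)

    def forward(self):
        # Node features keyed by node type, ready to pass to a
        # heterogeneous GNN that expects one feature tensor per type.
        return {"paper": self.paper_emb, "author": self.author_emb}


model = NodeEmbeddings(num_paper=100, num_author=40, dim=16)
feats = model()

# The embeddings are ordinary parameters, so the optimizer updates them
# together with the rest of the model.
opt = th.optim.Adam(model.parameters(), lr=0.01)
```

Because the embeddings are registered as `nn.Parameter`, they show up in `model.parameters()` and are trained end to end with the link prediction loss.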
