Heterogeneous graph and feature vector size equalization


Based on the guide at https://docs.dgl.ai/en/latest/guide/training-link.html, I created a simple link prediction program. It uses a heterogeneous graph with two types of nodes (user, product).

Throughout the DGL documentation, node features are assigned as follows:

hetero_graph.nodes['user'].data['feature'] = torch.randn(n_users, n_hetero_features)
hetero_graph.nodes['item'].data['feature'] = torch.randn(n_items, n_hetero_features)

What should I do when the feature vector size differs between the user and the product? For example, a product has 10 features (e.g. price, category, promotion) and a user has only 2 (gender, country). Of course, the data is numerical.

Can I somehow preprocess the features before training the network, e.g. with dimensionality reduction?

Do I have to extend or change the model? I am using the model from the DGL documentation:

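import torch.nn as nn

# SAGE and DotProductPredictor are the classes used in the DGL documentation.
class Model(nn.Module):
    def __init__(self, in_features, hidden_features, out_features):
        super().__init__()  # must run before submodules are assigned
        self.sage = SAGE(in_features, hidden_features, out_features)
        self.pred = DotProductPredictor()

    def forward(self, g, neg_g, x):
        # The same node embeddings score both the positive and the negative graph
        h = self.sage(g, x)
        return self.pred(g, h), self.pred(neg_g, h)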

Thanks in advance for your answer

Yes. For instance, you could project each feature into a hidden embedding of a common size with torch.nn.Embedding and sum the embeddings up before feeding the result into your Model.
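A minimal sketch of that idea, assuming every feature column is an integer-coded categorical value (torch.nn.Embedding looks up rows by integer index). The class name FeatureProjector, the cardinalities dictionary and the node counts are hypothetical and only serve the illustration:

```python
import torch
import torch.nn as nn

class FeatureProjector(nn.Module):
    """Hypothetical helper: maps each node type's features to one shared size."""

    def __init__(self, cardinalities, hidden_size):
        super().__init__()
        # cardinalities: {node type: [number of distinct values per feature column]}
        self.tables = nn.ModuleDict({
            ntype: nn.ModuleList([nn.Embedding(n, hidden_size) for n in cards])
            for ntype, cards in cardinalities.items()
        })

    def forward(self, feats):
        # feats: {node type: LongTensor of shape (num_nodes, num_features)}
        out = {}
        for ntype, x in feats.items():
            # Embed every feature column separately, then sum the embeddings
            cols = [emb(x[:, i]) for i, emb in enumerate(self.tables[ntype])]
            out[ntype] = torch.stack(cols).sum(dim=0)
        return out

# Assumed toy setup: users have 2 feature columns, items have 10
cardinalities = {"user": [2, 5], "item": [4] * 10}
n_users, n_items, hidden_size = 6, 8, 16

feats = {
    ntype: torch.stack(
        [torch.randint(0, n, (count,)) for n in cardinalities[ntype]], dim=1
    )
    for ntype, count in (("user", n_users), ("item", n_items))
}

proj = FeatureProjector(cardinalities, hidden_size)
h = proj(feats)  # both node types now have feature vectors of size hidden_size
```

The resulting dictionary h can be passed to the model as x, with in_features set to hidden_size. For continuous features such as a price, a separate nn.Linear layer per node type is a common alternative to an embedding table.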

Thank you very much! I've already started reading up on torch.nn.Embedding (maybe I'll come back with questions).

However, I have one more question. Suppose I create two types of users with features e.g. [0,0,0,0,0,1,1,1,1,1] and [1,1,1,1,1,0,0,0,0,0], and two types of items with the features [0,0,1,1] and [1,1,0,0]. Then I create a graph with the relations ('user', 'clicked', 'item') and ('item', 'clicked-by', 'user'), use torch.nn.Embedding and calculate the embeddings correctly.

How do I apply PCA to reduce the embedding dimensions and visualize the result (scatter_matrix)? Should I see an appropriate distribution of users and items, for example users with identical features, who click on items with similar features, ending up close to each other? Is this a good way to check the code/algorithms?
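One possible way to run such a check, sketched with plain NumPy. The random arrays below are only stand-ins for learned embeddings, and the function name pca_2d is hypothetical; with a trained model, the input would instead be something like h['user'].detach().cpu().numpy():

```python
import numpy as np

def pca_2d(embeddings):
    """Project the rows of `embeddings` onto their first two principal components."""
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions, ordered by explained variance
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

rng = np.random.default_rng(0)

# Stand-in data: two groups of 20 nodes each, 16 dimensions (not real model output)
group_a = rng.normal(loc=0.0, scale=0.1, size=(20, 16))
group_b = rng.normal(loc=1.0, scale=0.1, size=(20, 16))
emb = np.vstack([group_a, group_b])
labels = np.array([0] * 20 + [1] * 20)

points = pca_2d(emb)  # shape (40, 2), one 2-D point per node
```

The 2-D points can then be drawn as a scatter plot colored by labels. If the embeddings carry the intended signal, nodes of the same group should form visibly separate clusters. This is only a rough sanity check, not a substitute for evaluating link prediction quality.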