Multi-dimensional input features

TheWind · July 3, 2022, 8:58am

Hi! I am building a heterogeneous graph with two kinds of nodes and one kind of edge (plus its reverse), like this:

    hetero_graph = dgl.heterograph({
    ('user', 'click', 'item'): (click_src, click_dst),
    ('item', 'clicked-by', 'user'): (click_dst, click_src)})

Each node type has multiple features, and the feature dimension differs between node types, like this:

   hetero_graph.nodes['user'].data['feat'] = torch.randn(n_users, n_hetero_features)
   hetero_graph.nodes['user'].data['feat1'] = torch.randn(n_users, n_hetero_features)
   hetero_graph.nodes['item'].data['feat'] = torch.randn(n_items, 2*n_hetero_features)
   hetero_graph.nodes['item'].data['feat1'] = torch.randn(n_items, 2*n_hetero_features)
   hetero_graph.edges['click'].data['labels'] = torch.randint(0, 2, (n_clicks,))

I need to complete an edge classification task and the model looks like this:

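import torch
import torch.nn as nn
import torch.nn.functional as F
import dgl
import dgl.nn as dglnn

class HighwayMLP(nn.Module):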

    def __init__(self,
                 input_size,
                 gate_bias=-2,
                 activation_function=nn.functional.relu,
                 gate_activation=nn.functional.softmax):
        super(HighwayMLP, self).__init__()
        self.dropout = nn.Dropout(0.4)
        self.activation_function = activation_function
        self.gate_activation = gate_activation
        self.normal_layer = nn.Linear(input_size, input_size)
        self.gate_layer = nn.Linear(input_size, input_size)
        self.gate_layer.bias.data.fill_(gate_bias)

    def forward(self, x):
        normal_layer_result = self.activation_function(self.dropout(self.normal_layer(x)))
        gate_layer_result = self.gate_activation(self.gate_layer(x))
        multiplied_gate_and_normal = torch.mul(normal_layer_result, gate_layer_result)
        multiplied_gate_and_input = torch.mul((1 - gate_layer_result), x)
        return torch.add(multiplied_gate_and_normal,
                         multiplied_gate_and_input)

class StochasticTwoLayerRGCN(nn.Module):
   def __init__(self, in_feat, hidden_feat, out_feat, rel_names):
       super().__init__()
       self.conv1 = dglnn.HeteroGraphConv({
               rel[1] : dglnn.GraphConv(in_feat[rel[0]], hidden_feat, norm='both')
               for rel in rel_names
           })
       self.conv2 = dglnn.HeteroGraphConv({
               rel[1] : dglnn.GraphConv(hidden_feat, out_feat, norm='both')
               for rel in rel_names
           })

   def forward(self, blocks, x):
       x = self.conv1(blocks[0], x)
       x = self.conv2(blocks[1], x)
       return x

class ScorePredictor(nn.Module):
    def __init__(self, num_classes, in_features):
        super().__init__()
        self.W = nn.Linear(2 * in_features, num_classes)
        len_highway_input = in_features * 2
        self.highway_layer = HighwayMLP(len_highway_input)
        len_linear_input = len_highway_input
        self.label = nn.Linear(len_linear_input, num_classes)

    def apply_edges(self, edges):
        # data = torch.cat([edges.src['x'], edges.dst['x']], dim=-1)
        # edges.src['x'] is the left (source) node, edges.dst['x'] is the right (destination) node
        score = self.highway_layer(torch.cat([edges.src['x'], edges.dst['x']], dim=-1))
        score = self.label(score)
        output = F.log_softmax(score, dim=1)
        # return {'score': self.W(data)}
        return {'score': output}

    def forward(self, edge_subgraph, x):
        with edge_subgraph.local_scope():
            edge_subgraph.ndata['x'] = x
            for etype in edge_subgraph.canonical_etypes:
                edge_subgraph.apply_edges(self.apply_edges, etype=etype)
            return edge_subgraph.edata['score']

class Model(nn.Module):
    def __init__(self, in_features, hidden_features, out_features, num_classes,
                 etypes):
        super().__init__()
        self.rgcn = StochasticTwoLayerRGCN(
            in_features, hidden_features, out_features, etypes)
        self.pred = ScorePredictor(num_classes, out_features)

    def forward(self, edge_subgraph, blocks, x):
        x = self.rgcn(blocks, x)
        return self.pred(edge_subgraph, x)

Now, how should I modify my model so that it can use both feat and feat1?

----------------------------------------Dividing line----------------------------------------

Another case: what if I make each node feature multi-dimensional? For example:

hetero_graph.nodes['user'].data['feat'] = torch.randn(n_users, 2, n_hetero_features)
hetero_graph.nodes['item'].data['feat'] = torch.randn(n_items, 2, 2*n_hetero_features)

At this point each of my nodes has a feature of shape (2, n_hetero_features) for users, or (2, 2*n_hetero_features) for items.
I saw a mention of stacking multiple input features in FAQ 18. Are there any specific examples that can be used for reference? @mufeili

----------------------------------------Dividing line----------------------------------------
The last question: what if each of my node types contains two features, such as feat and feat1 below:

   hetero_graph.nodes['user'].data['feat'] = torch.randn(n_users, n_hetero_features)
   hetero_graph.nodes['user'].data['feat1'] = torch.randn(n_users, n_hetero_features)
   hetero_graph.nodes['item'].data['feat'] = torch.randn(n_items, 2*n_hetero_features)
   hetero_graph.nodes['item'].data['feat1'] = torch.randn(n_items, 2*n_hetero_features)
   hetero_graph.edges['click'].data['labels'] = torch.randint(0, 2, (n_clicks,))

When I perform the edge classification task, I pass only one feature to the RGCN, say feat. Over successive rounds of training, will the feat1 tensor be updated?

If you could answer these questions, I would be very grateful!

mufeili · July 4, 2022, 3:07am

Now, how should I modify my model so that it can use both feat and feat1?

Does it work for you to simply concatenate feat and feat1?
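
For reference, a minimal sketch of the concatenation idea. The sizes are hypothetical, and plain dicts stand in for hetero_graph.nodes[ntype].data so that the snippet runs without DGL; it shows one way to do this, not the only one.

```python
import torch

# Hypothetical sizes, chosen only for illustration.
n_users, n_items, n_hetero_features = 5, 7, 4

# Stand-ins for hetero_graph.nodes[ntype].data (assumption: plain dicts
# are used here so the sketch does not depend on DGL).
node_data = {
    'user': {'feat': torch.randn(n_users, n_hetero_features),
             'feat1': torch.randn(n_users, n_hetero_features)},
    'item': {'feat': torch.randn(n_items, 2 * n_hetero_features),
             'feat1': torch.randn(n_items, 2 * n_hetero_features)},
}

# Concatenate feat and feat1 along the feature axis, one tensor per node type.
node_features = {
    ntype: torch.cat([data['feat'], data['feat1']], dim=-1)
    for ntype, data in node_data.items()
}

# Per-node-type input sizes, in the dict form that in_feat[rel[0]] indexes
# in StochasticTwoLayerRGCN.
in_features = {ntype: h.shape[1] for ntype, h in node_features.items()}
print(in_features)  # {'user': 8, 'item': 16}
```

On the real graph, the same torch.cat call could be applied to hetero_graph.nodes['user'].data['feat'] and ['feat1'] (and likewise for 'item'), with the result stored under a new key and in_features passed to Model as the input sizes.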

I saw a mention of stacking multiple input features in FAQ 18. Are there any specific examples that can be used for reference?

I’ve updated the FAQ to include an example there.
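
Independently of the FAQ example (which is not reproduced here), a generic sketch of two common ways to turn stacked features of shape (N, 2, D) back into a 2-D input. The sizes are hypothetical, and neither option is claimed to be what the FAQ recommends.

```python
import torch

n_users, n_hetero_features = 5, 4  # hypothetical sizes

feat = torch.randn(n_users, n_hetero_features)
feat1 = torch.randn(n_users, n_hetero_features)

# Stack along a new axis: shape (n_users, 2, n_hetero_features).
stacked = torch.stack([feat, feat1], dim=1)

# Option A: fold the stacked axis into the feature axis,
# giving shape (n_users, 2 * n_hetero_features).
flat = stacked.flatten(start_dim=1)

# Option B: pool over the stacked axis,
# giving shape (n_users, n_hetero_features).
pooled = stacked.mean(dim=1)

print(stacked.shape, flat.shape, pooled.shape)
```

Option A is equivalent to concatenating feat and feat1 along the last axis, so the first layer's input size doubles. Option B keeps the input size unchanged but averages the two features together.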

When I perform the edge classification task, I pass only one feature to the RGCN, say feat. Over successive rounds of training, will the feat1 tensor be updated?

The model does not update the input node features. Instead, it computes updated node representations from the node features you provide. If feat1 is not utilized in your model, then its information is not utilized in computing the node representations.
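
A small check of this behavior, as a sketch: a single nn.Linear layer stands in for the RGCN and predictor, and the sizes, seed and learning rate are assumptions made for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
n_users, n_hetero_features = 5, 4  # hypothetical sizes

feat = torch.randn(n_users, n_hetero_features)   # passed to the model
feat1 = torch.randn(n_users, n_hetero_features)  # never passed to the model
labels = torch.randint(0, 2, (n_users,))

feat_before, feat1_before = feat.clone(), feat1.clone()

# Stand-in for the RGCN + predictor (assumption: any trainable module
# behaves the same way with respect to its inputs).
model = nn.Linear(n_hetero_features, 2)
weight_before = model.weight.detach().clone()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for _ in range(3):
    loss = nn.functional.cross_entropy(model(feat), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Only the model parameters are optimized; the input tensors stay as they were.
feat_unchanged = torch.equal(feat, feat_before)
feat1_unchanged = torch.equal(feat1, feat1_before)
weights_changed = not torch.equal(model.weight.detach(), weight_before)
print(feat_unchanged, feat1_unchanged, weights_changed)
```

If the node features themselves should be learned, they would have to be registered as parameters (for example through nn.Embedding) and handed to the optimizer; plain tensors stored in the graph are treated as fixed inputs.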

TheWind · July 4, 2022, 12:22pm

Thanks for your reply!
