Confused about the num_hidden parameter in GATConv

xixi-baba · October 14, 2019, 3:02am

In the API documentation:
https://docs.dgl.ai/api/python/nn.pytorch.html?highlight=gatconv#dgl.nn.pytorch.conv.GATConv
GATConv doesn’t have a parameter called num-hidden.

However, the PyTorch GAT example does use a num-hidden.

Besides, the argparse part of train.py uses ‘num-hidden’ rather than ‘num_hidden’, but the code still works. Why is that?

I’m confused about what this parameter means, whether it is needed, and whether ‘num-hidden’ and ‘num_hidden’ are the same.

mufeili · October 14, 2019, 5:56am

In the example, num_hidden is passed to GATConv positionally. Since the argument name is not given explicitly, it is simply the second parameter of GATConv, i.e. out_feats.
With argparse, any argument of name --A-B-...-C will become an attribute with name A_B_..._C.
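
Both points can be checked with a small sketch. Note that `fake_gat_conv` is a hypothetical stand-in that only mirrors the order of GATConv's first three parameters (in_feats, out_feats, num_heads); it is not the DGL layer, and the sizes used are arbitrary illustrative values.

```python
import argparse


def fake_gat_conv(in_feats, out_feats, num_heads):
    """Hypothetical stand-in that mirrors the order of GATConv's first three parameters."""
    return {"in_feats": in_feats, "out_feats": out_feats, "num_heads": num_heads}


parser = argparse.ArgumentParser()
# The flags are declared with dashes, as in train.py.
parser.add_argument("--num-hidden", type=int, default=8)
parser.add_argument("--num-heads", type=int, default=8)

# Simulate running: python train.py --num-hidden 16
args = parser.parse_args(["--num-hidden", "16"])

# argparse stores the value under an attribute with underscores.
num_hidden = args.num_hidden

# Passed positionally, num_hidden lands in the second slot, i.e. out_feats.
layer = fake_gat_conv(100, num_hidden, args.num_heads)
print(layer["out_feats"])  # 16
```

So ‘num-hidden’ is only the spelling on the command line, while ‘num_hidden’ is the attribute name in the code; both refer to the same value.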

xixi-baba · October 14, 2019, 8:19am

Thanks.
So does num-hidden mean the dimension of the node representations in the hidden layers?

mufeili · October 14, 2019, 4:28pm

Here it is in fact the output size (per attention head) of a GATConv layer. Since the model consists of multiple layers, it is also the hidden size of the whole model.

xixi-baba · November 26, 2019, 11:46am

Sorry for the late reply, but I’m still confused about the “hidden units” in GNN models, since GNN models don’t seem to have hidden units like those in CNN models. So what exactly are the hidden units?

mufeili · November 26, 2019, 3:43pm

Let’s say your input node features X form a matrix of shape (N, M_1), where N is the number of nodes and M_1 is the size of each node’s feature vector. Each GNN layer also projects the node features with

H = XW + b

where W has shape (M_1, M_2), so that H is a matrix of shape (N, M_2). num_hidden denotes sizes like M_2.
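
To make the shapes concrete, here is a minimal, dependency-free sketch of the projection H = XW + b. The sizes (N=5, M_1=3, M_2=4) and the helper `linear` are made up for illustration, and a real GNN layer would also aggregate information from neighboring nodes, which is left out here.

```python
import random


def linear(X, W, b):
    """Compute H = XW + b for nested lists: X is (N, M_1), W is (M_1, M_2), b has length M_2."""
    return [
        [sum(x_k * W[k][j] for k, x_k in enumerate(row)) + b[j] for j in range(len(b))]
        for row in X
    ]


# Illustrative sizes only; M_2 plays the role of num_hidden.
N, M_1, M_2 = 5, 3, 4

random.seed(0)
X = [[random.random() for _ in range(M_1)] for _ in range(N)]    # input node features
W = [[random.random() for _ in range(M_2)] for _ in range(M_1)]  # learnable weights
b = [0.0] * M_2                                                  # learnable bias

H = linear(X, W, b)
print(len(H), len(H[0]))  # 5 4
```

Each of the N nodes still has exactly one row, but that row now has M_2 entries instead of M_1. Those M_2 entries are the "hidden units" of the layer.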

xixi-baba · November 28, 2019, 10:45am

So, as I mentioned before, it’s the dimension of the hidden representation, right?
But in the cluster_gcn example, run_ppi.sh sets num_hidden to 2048, while the raw features of the PPI dataset are 50-dimensional. I don’t quite understand why a 50-dimensional vector should be transformed into a 2048-dimensional one.

mufeili · November 28, 2019, 12:05pm

We want our model to have sufficient capacity for representation learning. The larger the hidden dimension, the more parameters the model has and the more expressive it can be. May I ask whether you have a background in deep learning?
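
As a rough illustration of how the hidden size drives the parameter count, the sketch below counts the weights and biases of a single projection H = XW + b from 50 input features. The helper `linear_param_count` is hypothetical, the sizes 64 and 512 are arbitrary comparison points, and the count ignores everything else a real model contains (further layers, attention parameters, and so on).

```python
def linear_param_count(in_dim, out_dim):
    """Parameters in one projection: a weight matrix (in_dim x out_dim) plus a bias (out_dim)."""
    return in_dim * out_dim + out_dim


in_dim = 50  # raw feature size mentioned for the PPI dataset in this thread
for hidden in (64, 512, 2048):
    print(hidden, linear_param_count(in_dim, hidden))
# 64 3264
# 512 26112
# 2048 104448
```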

xixi-baba · November 28, 2019, 12:22pm

Thanks for replying. I took some online courses and read some GNN papers before. Is my question too stupid?