In the dgl.nn.pytorch GatedGraphConv class,
the model defines an embedding layer of shape (number_of_edge_types X output_feature_size**2).
In the bAbI task example, this output feature size is determined by the task id.
Example:
>>> import dgl
>>> from dgl.nn.pytorch import GatedGraphConv
>>> k = GatedGraphConv(in_feats=5, out_feats=10, n_steps=1, n_etypes=4)
>>> k
GatedGraphConv(
(edge_embed): Embedding(4, 100)
(gru): GRUCell(10, 10)
)
Why does the embedding need to have this shape?
I understand that the model learns an embedding for each edge type and that the message update goes through the gated GRU cell,
but why is the embedding vector length the square of output_feature_size?
Shouldn't it be number_of_edge_types X output_feature_size?
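For what it's worth, here is a minimal numpy sketch (not DGL's actual implementation) of the reading that would explain the shape: each edge type's flat vector of length out_feats**2 is reshaped into an (out_feats x out_feats) matrix that linearly transforms the source node's hidden state, as in the gated graph neural network formulation. A vector of length out_feats per type would only be able to scale coordinates, not mix them.

```python
# Sketch only -- an assumption about why the embedding row has length
# out_feats**2, not DGL's actual code. Variable names are hypothetical.
import numpy as np

n_etypes, out_feats = 4, 10
rng = np.random.default_rng(0)

# Analogue of nn.Embedding(n_etypes, out_feats**2): one flat row per edge type.
edge_embed = rng.standard_normal((n_etypes, out_feats * out_feats))

h_src = rng.standard_normal(out_feats)  # hidden state of one source node
etype = 2                               # edge type of one edge

# Reshape the flat row into a per-type weight matrix ...
W_e = edge_embed[etype].reshape(out_feats, out_feats)
# ... and use it as the message function: m = W_e @ h_src
message = W_e @ h_src

assert W_e.shape == (out_feats, out_feats)
assert message.shape == (out_feats,)
```

Under this reading, the "embedding" is really a lookup table of per-edge-type weight matrices, which is why each row needs out_feats**2 entries.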