[Blog] Understand Graph Attention Network

From Graph Convolutional Network (GCN), we learned that combining local graph structure and node-level features yields good performance on node classification tasks. However, the way GCN aggregates is structure-dependent, which may hurt its generalizability.


This is a companion discussion topic for the original entry at https://www.dgl.ai/blog/2019/02/17/gat.html

Complete code and notebook of this tutorial is available here: https://docs.dgl.ai/tutorials/models/1_gnn/9_gat.html.

Just wanted to know whether the GAT above can handle multiple node features.

Sounds like you want to deal with heterogeneous graphs. We are currently working on user-friendly support for them. Nevertheless, you should be able to work with different node features with some hacking. Feel free to provide more details if you want more concrete ideas.


Yes, exactly, but how to instrument GAT to handle multiple features? Please refer to the thread I started, Multiple Node & Edge features, where I got some answers, but not exactly what I needed.

We are designing a graph representation learning model to spot patterns for detecting fraudulent transactions, and also for defect detection/identification in the finance domain. Here the patterns are detected/identified through connected nodes/edges of various critical parameters/indicators. Each node has many features with corresponding values, and even the node/edge labels are factors in the representation learning.

Let’s say we have two kinds of node features n1, n2, and two kinds of edge features e1, e2. What detailed node/edge feature update would you like to perform (e.g., in the form of math equations)? I’ll try to prototype it with dgl.


Please find attached an example with a basic table to identify error patterns with two parameters, “Err Ind” and “Err Weightage”, which are derived from various other features: “Actual”, “Recorded”, “Diff”, “Diff in Value”. A UDF would suit this kind of scenario, and the expression may be specific to nodes/edges. Please suggest.

The table you present suggests that you have multiple node features. Then the simplest thing I would try is to concatenate all node features first and then perform a GNN node feature update as usual. I might also:

  1. Add a residual connection to preserve the original node features.
  2. Replace the linear layer with an MLP in the GNN node feature update.
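The recipe above (concatenate the feature sets, update with an MLP, keep a residual connection) can be sketched in plain PyTorch. This is a hypothetical illustration, not DGL code; the module name `NodeUpdate` and the dimensions are made up:

```python
import torch
import torch.nn as nn

class NodeUpdate(nn.Module):
    """Hypothetical sketch: fuse two node-feature sets with an MLP,
    keeping a residual (skip) connection to the raw features."""

    def __init__(self, in1, in2, hidden):
        super().__init__()
        # an MLP in place of a single linear layer
        self.mlp = nn.Sequential(
            nn.Linear(in1 + in2, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
        )

    def forward(self, f1, f2):
        h = torch.cat([f1, f2], dim=-1)   # concatenate the two feature sets
        out = self.mlp(h)
        # residual: keep the raw features alongside the updated ones
        return torch.cat([out, h], dim=-1)

f1 = torch.randn(5, 4)   # 5 nodes, 4-dim feature "n1"
f2 = torch.randn(5, 3)   # 5 nodes, 3-dim feature "n2"
model = NodeUpdate(4, 3, 8)
print(model(f1, f2).shape)  # torch.Size([5, 15])
```

In a real model the `mlp` call would sit inside the per-node update of a GNN layer; the concatenation-based residual grows the feature dimension, so an additive residual (with a projection) is a common alternative.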

Thanks for your quick update. A few queries about your suggestion:
#1 Is deep learning on node/edge labels factored into your recommendation?
#2 How do we handle a node feature whose values are strings, and sometimes strings and numeric values together?
#3 Is there any provision for learning on directed graph representations in DGL? I believe we may have a workaround by storing direction as one of the node/edge features.

  1. Just to get started, you can treat edge labels as additional messages, which can be sent with dgl.function.copy_edge. You can concatenate the edge labels with the features of the source nodes and perform a similar operation to my previous proposal.
  2. I would maintain a mapping between the strings and one-hot encodings/embeddings so that we can get rid of the strings.
  3. You can use DGL for directed graphs. Either learn on the raw directed graph, or add edges for the other direction and perform different operations on the raw edges and the reverse edges, for which our reverse transform might be helpful.
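Point 1 above can be mimicked in plain PyTorch to show what “send the edge label with the source feature” means, without depending on a particular DGL version. This is a hand-rolled sketch of the copy-edge-plus-reduce pattern; the graph, feature sizes, and sum reduction are all illustrative assumptions:

```python
import torch

# Hypothetical toy graph: 4 nodes, 4 directed edges.
num_nodes = 4
src = torch.tensor([0, 1, 2, 2])   # edge sources
dst = torch.tensor([1, 2, 3, 0])   # edge destinations
node_feat = torch.randn(num_nodes, 5)
edge_label = torch.randn(4, 2)     # one 2-dim label per edge

# message on each edge = [source-node feature ‖ edge label]
msg = torch.cat([node_feat[src], edge_label], dim=-1)

# reduce: sum the incoming messages at each destination node
agg = torch.zeros(num_nodes, msg.size(-1))
agg.index_add_(0, dst, msg)
print(agg.shape)  # torch.Size([4, 7])
```

The aggregated tensor `agg` would then feed the node-update MLP from the earlier proposal; in DGL the message/reduce pair is expressed with its built-in message and reduce functions instead of `index_add_`.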

Referring to your recommendation:
*** Add residual connection to preserve original node feature.**
What does a residual connection mean? Is it just adding some more features to a node?
*** Replace a linear layer by a MLP in GNN node feature update.**
I believe you are referring to feature extraction concepts here. We know the message passing techniques via reduce and apply-node functions. Do you mean something else? Can you please help me with some code snippets or a link on feature updates / MLP layers in DGL?

  1. For a residual connection, you can concatenate the updated node features with the raw node features so that no information is lost.
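The concatenation-style residual described above fits in two lines of PyTorch. Here a plain `nn.Linear` stands in for whatever GNN feature update is used; the sizes are illustrative:

```python
import torch
import torch.nn as nn

raw = torch.randn(6, 8)        # raw features of 6 nodes
layer = nn.Linear(8, 16)       # stand-in for a GNN node feature update
updated = torch.relu(layer(raw))

# residual by concatenation: the raw features survive unchanged
h = torch.cat([updated, raw], dim=-1)
print(h.shape)  # torch.Size([6, 24])
```

Note this differs from the additive residual of ResNet (`updated + raw`, which requires matching dimensions); concatenation preserves the original features exactly, at the cost of a growing feature size.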

Does the GAT take into account edge features/attributes? If not, how to include them?

How to create the kind of graph visualization shown in the link?

No, GAT does not take edge features into account.
It’s possible to extend GAT so that it can handle edge information, for example with relative-representation attention:

e_{uv} = \frac{(x_u W^Q)(x_v W^K + r_{uv}^K)^\top}{\sqrt{d}}, \qquad z_u = \sum_{v} \alpha_{uv} \left( x_v W^V + r_{uv}^V \right)

where r_{uv}^K and r_{uv}^V are two sets of edge features determined by the edge type. (Reference: https://www.aclweb.org/anthology/N18-2074)

Of course you could try other methods too.
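One simple alternative is to let the GAT-style attention score see an edge feature directly, by scoring [z_u ‖ z_v ‖ e_uv] instead of [z_u ‖ z_v]. The sketch below is a hypothetical single-head layer in plain PyTorch, not DGL's built-in GATConv; the class name, dimensions, and toy graph are all made up:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EdgeAttention(nn.Module):
    """Hypothetical GAT-style layer whose attention logit also
    depends on a per-edge feature vector."""

    def __init__(self, node_dim, edge_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(node_dim, out_dim, bias=False)
        self.attn = nn.Linear(2 * out_dim + edge_dim, 1, bias=False)

    def forward(self, h, src, dst, e):
        z = self.W(h)
        # unnormalized score per edge from [z_u ‖ z_v ‖ e_uv]
        score = F.leaky_relu(self.attn(torch.cat([z[src], z[dst], e], dim=-1)))
        # softmax over the incoming edges of each destination node
        score = score.exp()
        denom = torch.zeros(h.size(0), 1).index_add_(0, dst, score)
        alpha = score / denom[dst]
        # weighted sum of transformed source features
        out = torch.zeros(h.size(0), z.size(-1)).index_add_(0, dst, alpha * z[src])
        return out, alpha

h = torch.randn(4, 5)              # 4 nodes, 5-dim features
src = torch.tensor([0, 1, 2])      # 3 edges
dst = torch.tensor([1, 1, 3])
e = torch.randn(3, 2)              # 2-dim edge features
model = EdgeAttention(5, 2, 8)
out, alpha = model(h, src, dst, e)
print(out.shape, alpha.shape)      # torch.Size([4, 8]) torch.Size([3, 1])
```

Nodes with no incoming edges keep a zero output here; a production layer would add self-loops, multiple heads, and numerically stable softmax, as the real GAT does.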


See How to plot the attention weights...?
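For the visualization question, a minimal sketch with NetworkX and Matplotlib draws a graph with edge widths proportional to attention weights, similar in spirit to the figures in the blog post. The graph and the attention values here are invented for illustration; in practice you would pull the weights out of your trained GAT layer:

```python
import networkx as nx
import matplotlib
matplotlib.use("Agg")  # headless backend, no display needed
import matplotlib.pyplot as plt

# Toy graph and made-up attention weights, one per directed edge.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
attn = [0.9, 0.4, 0.7, 0.2]

g = nx.DiGraph()
for (u, v), a in zip(edges, attn):
    g.add_edge(u, v, weight=a)

pos = nx.spring_layout(g, seed=0)
# edge width encodes the attention weight
widths = [5 * g[u][v]["weight"] for u, v in g.edges()]
nx.draw(g, pos, with_labels=True, width=widths, node_color="lightblue")
plt.savefig("attention.png")
```

Color maps (passing the weights as `edge_color` with a `edge_cmap`) are another common way to encode the attention values.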