Hi, I’m trying to treat a batched tensor as a batch of adjacency matrices (batch_of_adj: tensor of shape [8, 512, 512], i.e. 8 graphs of 512 nodes each) and build a batch of DGLGraphs from it. Assume that each adjacency matrix is non-negative and thresholded at 0.7 (so every entry is either > 0.7 or 0).
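For context, the adjacency batch is produced roughly like this (a simplified sketch; `scores` is just an illustrative stand-in for the pairwise scores my model actually computes):

```python
import torch

# Illustrative stand-in for the model's pairwise scores in [0, 1]
scores = torch.rand(8, 512, 512)
threshold = 0.7  # the hyperparameter discussed below
# Zero out entries at or below the threshold; surviving entries keep
# their original positive values, so each entry is either 0 or > 0.7
batch_of_adj = scores * (scores > threshold)
```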
Currently, I’m doing something like this to build the graphs, pass them through a model that updates the node features, and stack the node features back into one batched tensor for further processing:
```python
import torch
import dgl

# features has shape [8, 512, hidden_dim]
index = [adj.nonzero(as_tuple=False).t().contiguous() for adj in batch_of_adj]
values = [adj[idx[0], idx[1]] for idx, adj in zip(index, batch_of_adj)]
# num_nodes guards against isolated nodes: without it, DGL infers the node
# count from the largest edge index, which can be < 512
graphs = [dgl.graph((idx[0], idx[1]), num_nodes=adj.shape[0])
          for idx, adj in zip(index, batch_of_adj)]

for i, g in enumerate(graphs):
    g.ndata['h'] = features[i]
    g.edata['w'] = values[i]

g = dgl.batch(graphs)
g = GraphModel(g)  # GraphModel updates the node features of the batched graph
gs = dgl.unbatch(g)  # unbatch to get the updated features back per graph
graph_output = torch.stack([g.ndata['h'] for g in gs])
```
A weird thing I’ve noticed is that changing the value of the “threshold” hyperparameter I use to compute the weighted adjacency matrices (threshold=0.7 in this example) has no effect on the loss or the accuracy (even threshold=0.01 and threshold=0.99 yield identical losses), which makes me think that something is going wrong.
I’ve confirmed that different threshold values lead to graphs of different sizes (a higher threshold means smaller graphs, since fewer edges survive). My suspicion is that something in the three lines that build index, values, and graphs is preventing the threshold from affecting the loss, even though it clearly changes the graph sizes.
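Roughly how I checked, reusing the illustrative `scores` from above:

```python
# Edge count shrinks as the threshold grows, so the graphs really do differ
for thr in (0.01, 0.5, 0.99):
    adj = scores * (scores > thr)
    print(thr, adj.count_nonzero().item())
```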
I have two main questions:
- Is there a better way to construct the graphs from the batched adjacency matrices than my three lines for index/values/graphs? (A vectorized alternative I’ve been considering is sketched after these questions.)
- How would you explain the different threshold values resulting in identical loss and accuracy?
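For reference, here is the vectorized alternative I mentioned in the first question. It’s only a sketch under my assumptions (512 nodes per graph, GraphModel reading 'h' and 'w' as above), not something I’ve validated:

```python
import torch
import dgl

B, N = batch_of_adj.shape[0], batch_of_adj.shape[1]

# All edges of all graphs at once; offset node ids so each graph
# occupies its own block of one big block-diagonal graph
b, row, col = batch_of_adj.nonzero(as_tuple=True)
src, dst = b * N + row, b * N + col

big_g = dgl.graph((src, dst), num_nodes=B * N)
big_g.ndata['h'] = features.reshape(B * N, -1)
big_g.edata['w'] = batch_of_adj[b, row, col]

# Attach batch information in case GraphModel uses readout ops
big_g.set_batch_num_nodes(torch.full((B,), N, dtype=torch.long))
big_g.set_batch_num_edges(torch.bincount(b, minlength=B))

big_g = GraphModel(big_g)
graph_output = big_g.ndata['h'].reshape(B, N, -1)
```

I’m not sure whether this is actually more idiomatic than per-graph construction plus dgl.batch, hence the question.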
I’d really appreciate any insights!