Question about running 'apply_edges' in a contrastive model: a potential bug?

I want to implement a contrastive graph model. I have an encoder for two heterogeneous graphs, G_raw and G_aug, where G_aug is obtained by removing some edges from G_raw. The details of the two graphs are as follows:

 G_raw:
 Graph(num_nodes={'author': 7167, 'paper': 4019, 'subject': 60},
      num_edges={('author', 'author-paper', 'paper'): 13407, ('paper', 'paper-author', 'author'): 13407, ('paper', 'paper-subject', 'subject'): 4019, ('subject', 'subject-paper', 'paper'): 4019},
      metagraph=[('author', 'paper', 'author-paper'), ('paper', 'author', 'paper-author'), ('paper', 'subject', 'paper-subject'), ('subject', 'paper', 'subject-paper')]) 

G_aug:
Graph(num_nodes={'author': 7167, 'paper': 4019, 'subject': 60},
      num_edges={('author', 'author-paper', 'paper'): 11627, ('paper', 'paper-author', 'author'): 11568, ('paper', 'paper-subject', 'subject'): 3225, ('subject', 'subject-paper', 'paper'): 3232},
      metagraph=[('author', 'paper', 'author-paper'), ('paper', 'author', 'paper-author'), ('paper', 'subject', 'paper-subject'), ('subject', 'paper', 'subject-paper')])

The encoder works well on G_raw with the forward pass encoder(G_raw),

but I get the following error when running the forward pass on G_aug with encoder(G_aug):


 File "/root/Downloads/lyx/contra/godsake/contra_hgt/model/hyp_model.py", line 100, in forward
    sub_graph.apply_edges(fn.v_add_u('q', 'k', 't'))
  File "/usr/local/anaconda3/envs/gcl/lib/python3.9/site-packages/dgl/heterograph.py", line 4463, in apply_edges
    self._set_e_repr(etid, eid, edata)
  File "/usr/local/anaconda3/envs/gcl/lib/python3.9/site-packages/dgl/heterograph.py", line 4238, in _set_e_repr
    self._edge_frames[etid].update(data)
  File "/usr/local/anaconda3/envs/gcl/lib/python3.9/_collections_abc.py", line 941, in update
    self[key] = other[key]
  File "/usr/local/anaconda3/envs/gcl/lib/python3.9/site-packages/dgl/frame.py", line 584, in __setitem__
    self.update_column(name, data)
  File "/usr/local/anaconda3/envs/gcl/lib/python3.9/site-packages/dgl/frame.py", line 661, in update_column
    raise DGLError('Expected data to have %d rows, got %d.' %
dgl._ffi.base.DGLError: Expected data to have 13407 rows, got 11627.

The corresponding code snippet is as follows:


def forward(self, G, h):
    with G.local_scope():
        for srctype, etype, dsttype in G.canonical_etypes:
            sub_graph = G[srctype, etype, dsttype]
            ......
            sub_graph.apply_edges(fn.v_add_u('q', 'k', 't'))  # the error is raised here

I printed the relevant data just before sub_graph.apply_edges() and got:


> subgraph:
Graph(num_nodes={'author': 7167, 'paper': 4019},
      num_edges={('author', 'author-paper', 'paper'): 11627},
      metagraph=[('author', 'paper', 'author-paper')]),  and:
sub_graph.srcdata['k'].shape: torch.Size([7167, 256])
sub_graph.dstdata['q'].shape: torch.Size([4019, 256])

I don't understand why apply_edges() on a sub_graph with 11627 edges reports 'Expected data to have 13407 rows'.
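To make the message easier to read: judging by the traceback, it is raised by a row-count check when a new edge feature column is written into the edge frame. "Expected" is the row count the frame has recorded, and "got" is the length of the data being written. The sketch below is a toy model of that kind of check, not DGL's actual implementation; `ToyFrame` is a hypothetical class used only for illustration.

```python
class ToyFrame:
    """Hypothetical stand-in for an edge feature store with a fixed row count."""

    def __init__(self, num_rows):
        self.num_rows = num_rows
        self.columns = {}

    def update_column(self, name, data):
        # Reject data whose length differs from the row count the frame recorded.
        if len(data) != self.num_rows:
            raise ValueError(
                "Expected data to have %d rows, got %d." % (self.num_rows, len(data))
            )
        self.columns[name] = data


# A frame that still records G_raw's 13407 'author-paper' edges ...
stale_frame = ToyFrame(num_rows=13407)

# ... rejects a per-edge result computed over G_aug's 11627 edges.
try:
    stale_frame.update_column("t", [0.0] * 11627)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

Read this way, the error suggests that the edge frame behind the sub_graph still records the edge count of G_raw, even though the sub_graph itself reports 11627 edges. That is an inference from the numbers in the message, not something confirmed in the DGL source.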

I noticed this question: Cannot assign edge data after g.remove_edges, but I still run into a similar problem. My DGL version is:

dgl-cu110                 0.8.2.post1              pypi_0    pypi
dglgo                     0.0.1                    pypi_0    pypi

It seems that 13407 is the number of edges in G_raw and 11627 is that for G_aug. How did you invoke the forward function? And what is the relationship between G_raw and G_aug?

A reproducible code example would be helpful here.
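Until a full reproduction is available, one quick check is to compare the number of rows of every stored edge feature against the edge count of its edge type. The helper below is a minimal sketch; the name `find_edge_row_mismatches` is made up for illustration, and it relies only on `canonical_etypes`, `num_edges(etype)` and `edges[etype].data` of the graph passed in. It can only catch a mismatch that shows up in a stored feature column, so an empty result does not rule out a stale internal row count.

```python
def find_edge_row_mismatches(g):
    """Return (etype, feature name, expected rows, actual rows) for every mismatch.

    `g` is assumed to be a DGL heterograph such as G_raw or G_aug.
    """
    mismatches = []
    for canonical_etype in g.canonical_etypes:
        expected = g.num_edges(canonical_etype)
        # g.edges[etype].data maps feature names to per-edge tensors.
        for name, tensor in g.edges[canonical_etype].data.items():
            actual = tensor.shape[0]
            if actual != expected:
                mismatches.append((canonical_etype, name, expected, actual))
    return mismatches
```

Calling `find_edge_row_mismatches(G_aug)` right after the augmentation step, and again just before the encoder is invoked on G_aug, might show whether any edge feature from G_raw is still attached to G_aug.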

The forward pass is invoked like this:

        z1 = encoding_model(G_raw, feat_key, target)
        z2 = encoding_model(G_aug, feat_key, target)

z1 is encoded successfully first, but the error occurs on G_aug.
13407 and 11627 are the edge counts of the same sub-graph ('author', 'author-paper', 'paper') in G_raw and G_aug, respectively.
As I mentioned, G_aug is obtained by removing some edges from G_raw. I tried to produce a minimal reproduction, but the model is complicated. I think the core question is why sub_graph.apply_edges(fn.v_add_u('q', 'k', 't')) works on G_raw first but then fails on the sub-graph with num_edges={('author', 'author-paper', 'paper'): 11627}, raising the same traceback as in my first post, which ends with:

dgl._ffi.base.DGLError: Expected data to have 13407 rows, got 11627.

Because I found similar reports in this forum and in GitHub issues, I suspect I have hit another bug here?

I tried the nightly version with the following code and it worked for me:

import dgl
import dgl.function as fn
import torch
g = dgl.heterograph({('A', 'AB', 'B'): ([1,2,3],[2,3,4]), ('B', 'BA', 'A'): ([2,3,4],[1,2,3])})
g.nodes['A'].data['x'] = torch.randn(4, 5)
g.nodes['B'].data['x'] = torch.randn(5, 5)
g.apply_edges(fn.u_add_v('x', 'x', 'w'))

sg = dgl.remove_edges(g, 2, etype='BA')
sg.apply_edges(fn.u_add_v('x', 'x', 'w'))

sg2 = sg['BA']
sg2.apply_edges(fn.u_add_v('x', 'x', 'w'))