Inputs to SpMM and SDDMM

Hi! Question about internals/FeatGraph here. It’s obvious to me what happens to the rhs_data for these (GE-SpMM/SDDMM) functions, but I’m interested in what’s going on with lhs_data for both of them. I would expect the LHS to be the sparse adjacency matrix. Is the lhs_data being substituted into the sparse matrix for the values, or even originally stored as the sparse matrix?

Could someone post a brief pseudocode? I couldn’t find any in the papers, and once I got to templates in the repo I figured I’d ask here before diving deeper.

The sparse adjacency matrix is stored in the DGLGraph, which is passed as the first argument to the SpMM/SDDMM operators. Both lhs_data and rhs_data are dense matrices.

Documentation on dgl.ops might help.

I understand that. My question is what’s the actual operation in SpMM. I would only expect 2 inputs, the sparse adjacency matrix and the dense node features. Instead SpMM has three inputs.

Hi @Pangurbon, DGL implements *generalized* SpMM (gSpMM) rather than regular SpMM, so the interface is extended accordingly. Take dgl.ops.gspmm for example:

  • The first argument is a DGLGraph which is equivalent to the sparse adjacency matrix.
  • The second and third arguments specify the binary operation between source and edge data, and the reduce operation to aggregate the messages.
  • lhs_data and rhs_data are both dense tensors. In gspmm, lhs_data holds the source node features and rhs_data holds the edge features, each batched along the first dimension (one row per node or per edge, respectively).
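Since pseudocode was requested, here is a minimal NumPy sketch of the semantics described in the list above. It is not DGL's actual kernel: the function name `gspmm_reference`, the explicit `src`/`dst` edge arrays (standing in for the DGLGraph), and the small set of supported ops are assumptions made for illustration only.

```python
import numpy as np

def gspmm_reference(src, dst, num_dst, op, reduce_op, lhs_data, rhs_data):
    """Naive reference for generalized SpMM semantics (illustrative only).

    src, dst : int arrays of length E; edge i goes from src[i] to dst[i].
               Together they play the role of the sparse adjacency structure.
    lhs_data : (N_src, D) dense source-node features.
    rhs_data : (E, D) or (E, 1) dense edge features.
    """
    binary_ops = {
        "add": np.add,
        "mul": np.multiply,
        "copy_lhs": lambda u, e: u,  # ignore edge data, just copy node data
    }
    # Step 1: one message per edge, combining source-node and edge data.
    messages = binary_ops[op](lhs_data[src], rhs_data)
    shape = (num_dst,) + messages.shape[1:]

    # Step 2: reduce all messages that arrive at the same destination node.
    if reduce_op == "sum":
        out = np.zeros(shape)
        np.add.at(out, dst, messages)
    elif reduce_op == "max":
        out = np.full(shape, -np.inf)
        np.maximum.at(out, dst, messages)
        out[np.isneginf(out)] = 0.0  # assumption: nodes with no in-edges get 0
    else:
        raise ValueError(f"unsupported reduce_op: {reduce_op}")
    return out

# Tiny example: edges 0->1, 0->2, 1->2 on a 3-node graph.
src = np.array([0, 0, 1])
dst = np.array([1, 2, 2])
node_feat = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # lhs_data
edge_feat = np.array([[2.0], [3.0], [4.0]])                 # rhs_data
result = gspmm_reference(src, dst, 3, "mul", "sum", node_feat, edge_feat)
```

So lhs_data is never written into the sparse matrix: the graph only supplies which rows of lhs_data and rhs_data get paired up and where the result is accumulated.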

You may notice that in regular SpMM the edge weights are stored as the non-zero values of the sparse matrix, and wonder why DGL does not do the same. The reason is that DGL's gSpMM supports high-dimensional, non-scalar edge weights, which cannot be stored as single matrix entries, so the edge data becomes an explicit argument.
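To make the difference concrete, here is a small NumPy sketch (not DGL code; all variable names are made up for illustration, and it assumes a graph without duplicate edges). With scalar edge weights, the message-passing form matches an ordinary matrix product with a weighted adjacency matrix. With one weight vector per edge there is no single matrix to multiply by, so the weights have to be passed separately.

```python
import numpy as np

# Edges 0->1, 0->2, 1->2 on a 3-node graph.
src = np.array([0, 0, 1])
dst = np.array([1, 2, 2])
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # dense node features

# Case 1: scalar edge weights. They fit into the non-zeros of A,
# so regular SpMM (written densely here) gives the answer.
w = np.array([2.0, 3.0, 4.0])
A = np.zeros((3, 3))
A[dst, src] = w            # row = destination, column = source
regular = A @ X

# Same computation written as "multiply on edges, then sum at destinations".
generalized = np.zeros_like(X)
np.add.at(generalized, dst, X[src] * w[:, None])

# Case 2: one weight vector per edge, shape (E, D). There is no single
# scalar to put in A[dst, src], so the weights stay a separate dense tensor.
W = np.array([[2.0, 0.5], [3.0, 1.0], [4.0, 0.25]])
vector_weighted = np.zeros_like(X)
np.add.at(vector_weighted, dst, X[src] * W)
```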