The node features' data type torch.float16 doesn't match edge features' data type torch.float32, please convert them to the same type

As the title says, I hit an error while training a graph neural network. My model setup is as follows:
import dgl
import torch.nn as nn
import torch.nn.functional as F
from dgl.nn import APPNPConv

class GNAE(nn.Module):
    def __init__(self, in_channels, hide_channels):
        super(GNAE, self).__init__()
        self.linear1 = nn.Linear(in_channels, hide_channels)
        self.propagate = APPNPConv(k=1, alpha=0)

    def forward(self, g, features):
        x = features
        x = self.linear1(x)
        # L2-normalize each row, then scale by 1.8
        x = F.normalize(x, p=2, dim=1) * 1.8
        x = self.propagate(g, x)
        return x

dataset = dgl.data.CoraGraphDataset(verbose=False)
dgl_g = dataset[0]
# Cast the model parameters to float16
model = GNAE(features.shape[1], 256).half()

As the model shows, I plan to use .half() to reduce memory overhead, so I also convert the features to float16. However, when I run the code, it reports that the data types of the edge and node features do not match, even though my model does not use edge features and dgl_g has no edge features. The full error message is as follows:
Traceback (most recent call last):
  File "/home/lyx/program/pytorch_LinkPrediction_encoder_decoder.py", line 442, in
    logits = model(dgl_g.to(device), features.to(device))
  File "/usr/local/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/lyx/program/pytorch_LinkPrediction_encoder_decoder.py", line 62, in forward
    x = self.propagate(g,x)
  File "/usr/local/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/anaconda3/lib/python3.9/site-packages/dgl/nn/pytorch/conv/appnpconv.py", line 117, in forward
    graph.update_all(fn.u_mul_e("h", "w", "m"), fn.sum("m", "h"))
  File "/usr/local/anaconda3/lib/python3.9/site-packages/dgl/heterograph.py", line 5110, in update_all
    ndata = core.message_passing(
  File "/usr/local/anaconda3/lib/python3.9/site-packages/dgl/core.py", line 388, in message_passing
    ndata = invoke_gspmm(g, mfunc, rfunc)
  File "/usr/local/anaconda3/lib/python3.9/site-packages/dgl/core.py", line 349, in invoke_gspmm
    z = op(graph, x, y)
  File "/usr/local/anaconda3/lib/python3.9/site-packages/dgl/ops/spmm.py", line 173, in func
    return gspmm(g, binary_op, reduce_op, x, y)
  File "/usr/local/anaconda3/lib/python3.9/site-packages/dgl/ops/spmm.py", line 79, in gspmm
    ret = gspmm_internal(
  File "/usr/local/anaconda3/lib/python3.9/site-packages/dgl/backend/pytorch/sparse.py", line 1032, in gspmm
    return GSpMM.apply(*args)
  File "/usr/local/anaconda3/lib/python3.9/site-packages/dgl/backend/pytorch/sparse.py", line 165, in forward
    out, (argX, argY) = _gspmm(gidx, op, reduce_op, X, Y)
  File "/usr/local/anaconda3/lib/python3.9/site-packages/dgl/_sparse_ops.py", line 203, in _gspmm
    raise DGLError(
dgl._ffi.base.DGLError: The node features' data type torch.float16 doesn't match edge features' data type torch.float32, please convert them to the same type.

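The error is caused by a tensor that APPNPConv creates during its forward pass: https://github.com/dmlc/dgl/blob/fd4ce7cc7cf003c451ee89ad26abc67de0d29354/python/dgl/nn/pytorch/conv/appnpconv.py#L112. That tensor is created in float32, so it no longer matches your float16 node features, which is why the error mentions edge features even though your graph has none of its own. We do not plan to support half-precision training more broadly in the near future. As a temporary workaround, you can modify that line so the tensor's data type matches the node features.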
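To illustrate the idea behind the workaround, here is a minimal PyTorch-only sketch. It is not DGL's actual code: `make_edge_weights` is a hypothetical helper, and the tensor shapes are made up for the example. It only shows that a tensor created without an explicit dtype follows PyTorch's global default, while passing the feature tensor's dtype and device keeps the two consistent.

```python
import torch


def make_edge_weights(num_edges, feat):
    """Hypothetical helper: all-ones edge weights that follow feat's dtype and device."""
    return torch.ones(num_edges, 1, dtype=feat.dtype, device=feat.device)


# Stand-in for half-precision node features (4 nodes, 8 channels).
feat = torch.zeros(4, 8, dtype=torch.float16)
num_edges = 5

# No dtype given: the tensor uses the global default dtype,
# which is float32 unless it has been changed.
default_w = torch.ones(num_edges, 1)

# Dtype and device copied from the features, so the two agree.
matched_w = make_edge_weights(num_edges, feat)

print("default weights:", default_w.dtype, "| features:", feat.dtype)
print("matched weights:", matched_w.dtype, "| features:", feat.dtype)
```

Applying the same pattern at the linked line in your local copy of appnpconv.py should keep the node and edge dtypes aligned, though whether the rest of the half-precision pipeline then works depends on your DGL build.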

Thank you for your reply. I will look into it.
