SAGEConv Error

Piento28 · October 24, 2019, 5:30am

Hi,

I am trying to implement GraphSAGE on a very large graph.

My first plan was to use NodeFlow with SAGEConv, but SAGEConv only accepts a DGLGraph, and a NodeFlow cannot currently be converted into a DGLGraph, so I gave up on that idea.

Then I tried a plain DGLGraph with SAGEConv (I know this could be very inefficient), but got this complicated error:

---------------------------------------------------------------------------
DGLError                                  Traceback (most recent call last)
<ipython-input-7-7b1b03d1411a> in <module>
     35     for epoch in range(num_epochs):
     36         epoch_loss = 0
---> 37         prediction = model(g, cora_nodes_features[:34])
     38         print(prediction.shape)
     39 #         loss = loss_func(prediction, ground_truth)

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    539             result = self._slow_forward(*input, **kwargs)
    540         else:
--> 541             result = self.forward(*input, **kwargs)
    542         for hook in self._forward_hooks.values():
    543             hook_result = hook(self, input, result)

<ipython-input-6-6679e06a574f> in forward(self, graph, features)
     15         h = features
     16         for layer in self.layers:
---> 17             h = layer(g, h)
     18         return h

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    539             result = self._slow_forward(*input, **kwargs)
    540         else:
--> 541             result = self.forward(*input, **kwargs)
    542         for hook in self._forward_hooks.values():
    543             hook_result = hook(self, input, result)

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/dgl/nn/pytorch/conv/sageconv.py in forward(self, graph, feat)
    108         if self._aggre_type == 'mean':
    109             graph.ndata['h'] = feat
--> 110             graph.update_all(fn.copy_src('h', 'm'), fn.mean('m', 'neigh'))
    111             h_neigh = graph.ndata['neigh']
    112         elif self._aggre_type == 'gcn':

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/dgl/graph.py in update_all(self, message_func, reduce_func, apply_node_func)
   2745                                           reduce_func=reduce_func,
   2746                                           apply_func=apply_node_func)
-> 2747             Runtime.run(prog)
   2748 
   2749     def prop_nodes(self,

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/dgl/runtime/runtime.py in run(prog)
      9         for exe in prog.execs:
     10             # prog.pprint_exe(exe)
---> 11             exe.run()

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/dgl/runtime/ir/executor.py in run(self)
   1198         self.ret.data = F.copy_reduce(
   1199             self.reducer, graph, self.target, in_data, self.out_size, in_map,
-> 1200             out_map)
   1201 
   1202 

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/dgl/backend/pytorch/tensor.py in forward(ctx, reducer, graph, target, in_data, out_size, in_map, out_map)
    371         K.copy_reduce(
    372             reducer if reducer != 'mean' else 'sum',
--> 373             graph, target, in_data_nd, out_data_nd, in_map[0], out_map[0])
    374         # normalize if mean reducer
    375         # NOTE(zihao): this is a temporary hack and we should have better solution in the future.

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/dgl/kernel.py in copy_reduce(reducer, G, target, X, out, X_rows, out_rows)
    370     _CAPI_DGLKernelCopyReduce(
    371         reducer, G, int(target),
--> 372         X, out, X_rows, out_rows)
    373 
    374 # pylint: disable=invalid-name

dgl/_ffi/_cython/./function.pxi in dgl._ffi._cy3.core.FunctionBase.__call__()

dgl/_ffi/_cython/./function.pxi in dgl._ffi._cy3.core.FuncCall()

dgl/_ffi/_cython/./base.pxi in dgl._ffi._cy3.core.CALL()

DGLError: [14:24:40] /Users/xiangsx/work/dgl/dgl/src/kernel/cpu/../binary_reduce_impl.h:112: Unsupported dtype: _@
Stack trace:
  [bt] (0) 1   libdgl.dylib                        0x0000000120bb1309 dmlc::LogMessageFatal::~LogMessageFatal() + 57
  [bt] (1) 2   libdgl.dylib                        0x000000012113dc9d void dgl::kernel::BinaryReduceImpl<1>(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, dgl::kernel::CSRWrapper const&, dgl::kernel::binary_op::Target, dgl::kernel::binary_op::Target, dgl::runtime::NDArray, dgl::runtime::NDArray, dgl::runtime::NDArray, dgl::runtime::NDArray, dgl::runtime::NDArray, dgl::runtime::NDArray) + 1037
  [bt] (2) 3   libdgl.dylib                        0x0000000120bd4ce7 dgl::kernel::CopyReduce(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, dgl::kernel::CSRWrapper const&, dgl::kernel::binary_op::Target, dgl::runtime::NDArray, dgl::runtime::NDArray, dgl::runtime::NDArray, dgl::runtime::NDArray) + 2151
  [bt] (3) 4   libdgl.dylib                        0x0000000120bde991 std::__1::__function::__func<dgl::kernel::$_4, std::__1::allocator<dgl::kernel::$_4>, void (dgl::runtime::DGLArgs, dgl::runtime::DGLRetValue*)>::operator()(dgl::runtime::DGLArgs&&, dgl::runtime::DGLRetValue*&&) + 1473
  [bt] (4) 5   libdgl.dylib                        0x0000000121368de6 DGLFuncCall + 70
  [bt] (5) 6   core.cpython-37m-darwin.so          0x000000012197f86c __pyx_f_3dgl_4_ffi_4_cy3_4core_FuncCall(void*, _object*, DGLValue*, int*) + 924
  [bt] (6) 7   core.cpython-37m-darwin.so          0x0000000121983c27 __pyx_pw_3dgl_4_ffi_4_cy3_4core_12FunctionBase_5__call__(_object*, _object*, _object*) + 55
  [bt] (7) 8   python                              0x0000000103a82e03 _PyObject_FastCallKeywords + 179
  [bt] (8) 9   python                              0x0000000103bbfd75 call_function + 453

It seems to be caused by SAGEConv. Any help?

Thank you~

VoVAllen · October 24, 2019, 5:59am

Hi,

What’s the type of cora_nodes_features[:34]? Is it float?

Piento28 · October 24, 2019, 6:19am

YES!

It is the Cora dataset's feature matrix with size (2708, 1433); each element is a 0.0/1.0 float. The [:34] slice is only there because I created a small graph to validate SAGEConv.

After changing the type to int, I got something different:

---------------------------------------------------------------------------
DGLError                                  Traceback (most recent call last)
<ipython-input-7-7b1b03d1411a> in <module>
     35     for epoch in range(num_epochs):
     36         epoch_loss = 0
---> 37         prediction = model(g, cora_nodes_features[:34])
     38         print(prediction.shape)
     39 #         loss = loss_func(prediction, ground_truth)

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    539             result = self._slow_forward(*input, **kwargs)
    540         else:
--> 541             result = self.forward(*input, **kwargs)
    542         for hook in self._forward_hooks.values():
    543             hook_result = hook(self, input, result)

<ipython-input-6-6679e06a574f> in forward(self, graph, features)
     15         h = features
     16         for layer in self.layers:
---> 17             h = layer(g, h)
     18         return h

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    539             result = self._slow_forward(*input, **kwargs)
    540         else:
--> 541             result = self.forward(*input, **kwargs)
    542         for hook in self._forward_hooks.values():
    543             hook_result = hook(self, input, result)

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/dgl/nn/pytorch/conv/sageconv.py in forward(self, graph, feat)
    108         if self._aggre_type == 'mean':
    109             graph.ndata['h'] = feat
--> 110             graph.update_all(fn.copy_src('h', 'm'), fn.mean('m', 'neigh'))
    111             h_neigh = graph.ndata['neigh']
    112         elif self._aggre_type == 'gcn':

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/dgl/graph.py in update_all(self, message_func, reduce_func, apply_node_func)
   2745                                           reduce_func=reduce_func,
   2746                                           apply_func=apply_node_func)
-> 2747             Runtime.run(prog)
   2748 
   2749     def prop_nodes(self,

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/dgl/runtime/runtime.py in run(prog)
      9         for exe in prog.execs:
     10             # prog.pprint_exe(exe)
---> 11             exe.run()

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/dgl/runtime/ir/executor.py in run(self)
   1198         self.ret.data = F.copy_reduce(
   1199             self.reducer, graph, self.target, in_data, self.out_size, in_map,
-> 1200             out_map)
   1201 
   1202 

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/dgl/backend/pytorch/tensor.py in forward(ctx, reducer, graph, target, in_data, out_size, in_map, out_map)
    371         K.copy_reduce(
    372             reducer if reducer != 'mean' else 'sum',
--> 373             graph, target, in_data_nd, out_data_nd, in_map[0], out_map[0])
    374         # normalize if mean reducer
    375         # NOTE(zihao): this is a temporary hack and we should have better solution in the future.

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/dgl/kernel.py in copy_reduce(reducer, G, target, X, out, X_rows, out_rows)
    370     _CAPI_DGLKernelCopyReduce(
    371         reducer, G, int(target),
--> 372         X, out, X_rows, out_rows)
    373 
    374 # pylint: disable=invalid-name

dgl/_ffi/_cython/./function.pxi in dgl._ffi._cy3.core.FunctionBase.__call__()

dgl/_ffi/_cython/./function.pxi in dgl._ffi._cy3.core.FuncCall()

dgl/_ffi/_cython/./base.pxi in dgl._ffi._cy3.core.CALL()

DGLError: [15:25:59] /Users/xiangsx/work/dgl/dgl/src/kernel/cpu/../binary_reduce_impl.h:112: Unsupported dtype:

VoVAllen · October 24, 2019, 6:56am

Could you print out the result of cora_nodes_features[:34].dtype and type(cora_nodes_features[:34])?

Piento28 · October 30, 2019, 1:07am

Sorry for the late reply.

I tried both float and int types; each produced the error shown below.
cora_nodes_features = np.zeros((num_nodes,1433),dtype=float):

The number of train nodes is: 270
The number of test nodes is: 2438
torch.float64
<class 'torch.Tensor'>
---------------------------------------------------------------------------
DGLError                                  Traceback (most recent call last)
<ipython-input-7-7e5e2e7494a4> in <module>
     38     for epoch in range(num_epochs):
     39         epoch_loss = 0
---> 40         prediction = model(g, cora_nodes_features[:34])
     41         print(prediction.shape)
     42 #         loss = loss_func(prediction, ground_truth)

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    539             result = self._slow_forward(*input, **kwargs)
    540         else:
--> 541             result = self.forward(*input, **kwargs)
    542         for hook in self._forward_hooks.values():
    543             hook_result = hook(self, input, result)

<ipython-input-6-ae841694aa8a> in forward(self, graph, features)
     16         h = features
     17         for layer in self.layers:
---> 18             h = layer(g, h)
     19         return h

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    539             result = self._slow_forward(*input, **kwargs)
    540         else:
--> 541             result = self.forward(*input, **kwargs)
    542         for hook in self._forward_hooks.values():
    543             hook_result = hook(self, input, result)

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/dgl/nn/pytorch/conv/sageconv.py in forward(self, graph, feat)
    108         if self._aggre_type == 'mean':
    109             graph.ndata['h'] = feat
--> 110             graph.update_all(fn.copy_src('h', 'm'), fn.mean('m', 'neigh'))
    111             h_neigh = graph.ndata['neigh']
    112         elif self._aggre_type == 'gcn':

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/dgl/graph.py in update_all(self, message_func, reduce_func, apply_node_func)
   2745                                           reduce_func=reduce_func,
   2746                                           apply_func=apply_node_func)
-> 2747             Runtime.run(prog)
   2748 
   2749     def prop_nodes(self,

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/dgl/runtime/runtime.py in run(prog)
      9         for exe in prog.execs:
     10             # prog.pprint_exe(exe)
---> 11             exe.run()

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/dgl/runtime/ir/executor.py in run(self)
   1198         self.ret.data = F.copy_reduce(
   1199             self.reducer, graph, self.target, in_data, self.out_size, in_map,
-> 1200             out_map)
   1201 
   1202 

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/dgl/backend/pytorch/tensor.py in forward(ctx, reducer, graph, target, in_data, out_size, in_map, out_map)
    371         K.copy_reduce(
    372             reducer if reducer != 'mean' else 'sum',
--> 373             graph, target, in_data_nd, out_data_nd, in_map[0], out_map[0])
    374         # normalize if mean reducer
    375         # NOTE(zihao): this is a temporary hack and we should have better solution in the future.

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/dgl/kernel.py in copy_reduce(reducer, G, target, X, out, X_rows, out_rows)
    370     _CAPI_DGLKernelCopyReduce(
    371         reducer, G, int(target),
--> 372         X, out, X_rows, out_rows)
    373 
    374 # pylint: disable=invalid-name

dgl/_ffi/_cython/./function.pxi in dgl._ffi._cy3.core.FunctionBase.__call__()

dgl/_ffi/_cython/./function.pxi in dgl._ffi._cy3.core.FuncCall()

dgl/_ffi/_cython/./base.pxi in dgl._ffi._cy3.core.CALL()

DGLError: [10:03:43] /Users/xiangsx/work/dgl/dgl/src/kernel/cpu/../binary_reduce_impl.h:112: Unsupported dtype: _@
Stack trace:
  [bt] (0) 1   libdgl.dylib                        0x000000011f56d309 dmlc::LogMessageFatal::~LogMessageFatal() + 57
  [bt] (1) 2   libdgl.dylib                        0x000000011faf9c9d void dgl::kernel::BinaryReduceImpl<1>(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, dgl::kernel::CSRWrapper const&, dgl::kernel::binary_op::Target, dgl::kernel::binary_op::Target, dgl::runtime::NDArray, dgl::runtime::NDArray, dgl::runtime::NDArray, dgl::runtime::NDArray, dgl::runtime::NDArray, dgl::runtime::NDArray) + 1037
  [bt] (2) 3   libdgl.dylib                        0x000000011f590ce7 dgl::kernel::CopyReduce(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, dgl::kernel::CSRWrapper const&, dgl::kernel::binary_op::Target, dgl::runtime::NDArray, dgl::runtime::NDArray, dgl::runtime::NDArray, dgl::runtime::NDArray) + 2151
  [bt] (3) 4   libdgl.dylib                        0x000000011f59a991 std::__1::__function::__func<dgl::kernel::$_4, std::__1::allocator<dgl::kernel::$_4>, void (dgl::runtime::DGLArgs, dgl::runtime::DGLRetValue*)>::operator()(dgl::runtime::DGLArgs&&, dgl::runtime::DGLRetValue*&&) + 1473
  [bt] (4) 5   libdgl.dylib                        0x000000011fd24de6 DGLFuncCall + 70
  [bt] (5) 6   core.cpython-37m-darwin.so          0x000000012033b86c __pyx_f_3dgl_4_ffi_4_cy3_4core_FuncCall(void*, _object*, DGLValue*, int*) + 924
  [bt] (6) 7   core.cpython-37m-darwin.so          0x000000012033fc27 __pyx_pw_3dgl_4_ffi_4_cy3_4core_12FunctionBase_5__call__(_object*, _object*, _object*) + 55
  [bt] (7) 8   python                              0x0000000102441e03 _PyObject_FastCallKeywords + 179
  [bt] (8) 9   python                              0x000000010257ed75 call_function + 453

cora_nodes_features = np.zeros((num_nodes,1433),dtype=int)

The number of train nodes is: 270
The number of test nodes is: 2438
torch.int64
<class 'torch.Tensor'>
---------------------------------------------------------------------------
DGLError                                  Traceback (most recent call last)
<ipython-input-11-7e5e2e7494a4> in <module>
     38     for epoch in range(num_epochs):
     39         epoch_loss = 0
---> 40         prediction = model(g, cora_nodes_features[:34])
     41         print(prediction.shape)
     42 #         loss = loss_func(prediction, ground_truth)

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    539             result = self._slow_forward(*input, **kwargs)
    540         else:
--> 541             result = self.forward(*input, **kwargs)
    542         for hook in self._forward_hooks.values():
    543             hook_result = hook(self, input, result)

<ipython-input-10-ae841694aa8a> in forward(self, graph, features)
     16         h = features
     17         for layer in self.layers:
---> 18             h = layer(g, h)
     19         return h

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    539             result = self._slow_forward(*input, **kwargs)
    540         else:
--> 541             result = self.forward(*input, **kwargs)
    542         for hook in self._forward_hooks.values():
    543             hook_result = hook(self, input, result)

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/dgl/nn/pytorch/conv/sageconv.py in forward(self, graph, feat)
    108         if self._aggre_type == 'mean':
    109             graph.ndata['h'] = feat
--> 110             graph.update_all(fn.copy_src('h', 'm'), fn.mean('m', 'neigh'))
    111             h_neigh = graph.ndata['neigh']
    112         elif self._aggre_type == 'gcn':

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/dgl/graph.py in update_all(self, message_func, reduce_func, apply_node_func)
   2745                                           reduce_func=reduce_func,
   2746                                           apply_func=apply_node_func)
-> 2747             Runtime.run(prog)
   2748 
   2749     def prop_nodes(self,

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/dgl/runtime/runtime.py in run(prog)
      9         for exe in prog.execs:
     10             # prog.pprint_exe(exe)
---> 11             exe.run()

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/dgl/runtime/ir/executor.py in run(self)
   1198         self.ret.data = F.copy_reduce(
   1199             self.reducer, graph, self.target, in_data, self.out_size, in_map,
-> 1200             out_map)
   1201 
   1202 

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/dgl/backend/pytorch/tensor.py in forward(ctx, reducer, graph, target, in_data, out_size, in_map, out_map)
    371         K.copy_reduce(
    372             reducer if reducer != 'mean' else 'sum',
--> 373             graph, target, in_data_nd, out_data_nd, in_map[0], out_map[0])
    374         # normalize if mean reducer
    375         # NOTE(zihao): this is a temporary hack and we should have better solution in the future.

/opt/miniconda3/envs/PyTorch/lib/python3.7/site-packages/dgl/kernel.py in copy_reduce(reducer, G, target, X, out, X_rows, out_rows)
    370     _CAPI_DGLKernelCopyReduce(
    371         reducer, G, int(target),
--> 372         X, out, X_rows, out_rows)
    373 
    374 # pylint: disable=invalid-name

dgl/_ffi/_cython/./function.pxi in dgl._ffi._cy3.core.FunctionBase.__call__()

dgl/_ffi/_cython/./function.pxi in dgl._ffi._cy3.core.FuncCall()

dgl/_ffi/_cython/./base.pxi in dgl._ffi._cy3.core.CALL()

DGLError: [10:06:40] /Users/xiangsx/work/dgl/dgl/src/kernel/cpu/../binary_reduce_impl.h:112: Unsupported dtype:

bright1993ff66 · March 22, 2020, 2:01pm

I just encountered the same issue. Have you found a solution?

VoVAllen · March 23, 2020, 5:57am

The cause is that the dtype of the features is double (torch.float64) instead of float (torch.float32). Something like feat = feat.float() should fix this.
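
A minimal sketch of that cast, for illustration only. It assumes the features start as a NumPy array built with dtype=float, as in the posts above, and are turned into a tensor with torch.from_numpy; the thread does not show how the tensor was actually created, so that step is an assumption. No DGL calls are involved here, only the dtype handling.

```python
import numpy as np
import torch

num_nodes = 34  # small graph size, matching the [:34] slice used in the thread

# In NumPy, dtype=float means float64 (double precision).
cora_nodes_features = np.zeros((num_nodes, 1433), dtype=float)

# Assumed conversion step: torch.from_numpy keeps the NumPy dtype.
feat = torch.from_numpy(cora_nodes_features)
print(feat.dtype)  # torch.float64

# Cast to single precision before passing the features to the model.
feat = feat.float()
print(feat.dtype)  # torch.float32
```

Creating the array with dtype=np.float32 in the first place would avoid the extra cast. Switching to int does not help, since the int64 run above failed with the same "Unsupported dtype" error.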

bright1993ff66 · March 24, 2020, 3:17am

Thank you very much for your reply!

Your solution is great. It fixed the bug, thanks!