cudaErrorInvalidDevice when calling DGLGraph.to() to copy the graph to GPU 1

Python 3.6.9
TensorFlow 2.2.2
CUDA Version 10.1.243

My code is:

    g = data[0]
    if args.gpu < 0:
        device = "/cpu:0"
    else:
        device = "/gpu:1"
        g = g.to(device)

The error is:

/gpu:1
Traceback (most recent call last):
  File "/root/share/orion-perf/examples/tensorflow/node_classification/v2.2/main_gat.py", line 4, in <module>
    run_gat.run()
  File "/root/share/orion-perf/examples/tensorflow/node_classification/common/run_gat.py", line 127, in run
    g = g.to(device)
  File "/usr/local/lib/python3.6/dist-packages/dgl/heterograph.py", line 5192, in to
    ret._graph = self._graph.copy_to(utils.to_dgl_context(device))
  File "/usr/local/lib/python3.6/dist-packages/dgl/heterograph_index.py", line 234, in copy_to
    return _CAPI_DGLHeteroCopyTo(self, ctx.device_type, ctx.device_id)
  File "/usr/local/lib/python3.6/dist-packages/dgl/_ffi/_ctypes/function.py", line 190, in __call__
    ctypes.byref(ret_val), ctypes.byref(ret_tcode)))
  File "/usr/local/lib/python3.6/dist-packages/dgl/_ffi/base.py", line 64, in check_call
    raise DGLError(py_str(_LIB.DGLGetLastError()))
dgl._ffi.base.DGLError: [06:35:06] /opt/dgl/src/runtime/cuda/cuda_device_api.cc:93: Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading: CUDA: cudaErrorInvalidDevice
Stack trace:
  [bt] (0) /usr/local/lib/python3.6/dist-packages/dgl/libdgl.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x4f) [0x7f8889d8901f]
  [bt] (1) /usr/local/lib/python3.6/dist-packages/dgl/libdgl.so(dgl::runtime::CUDADeviceAPI::AllocDataSpace(DLContext, unsigned long, unsigned long, DLDataType)+0x283) [0x7f888a5ed863]
  [bt] (2) /usr/local/lib/python3.6/dist-packages/dgl/libdgl.so(dgl::runtime::NDArray::Empty(std::vector<long, std::allocator<long> >, DLDataType, DLContext)+0x351) [0x7f888a4a9361]
  [bt] (3) /usr/local/lib/python3.6/dist-packages/dgl/libdgl.so(dgl::runtime::NDArray::CopyTo(DLContext const&) const+0xc0) [0x7f888a4e0560]
  [bt] (4) /usr/local/lib/python3.6/dist-packages/dgl/libdgl.so(dgl::aten::COOMatrix::CopyTo(DLContext const&) const+0x7d) [0x7f888a5cfddd]
  [bt] (5) /usr/local/lib/python3.6/dist-packages/dgl/libdgl.so(dgl::UnitGraph::CopyTo(std::shared_ptr<dgl::BaseHeteroGraph>, DLContext const&)+0x292) [0x7f888a5c0562]
  [bt] (6) /usr/local/lib/python3.6/dist-packages/dgl/libdgl.so(dgl::HeteroGraph::CopyTo(std::shared_ptr<dgl::BaseHeteroGraph>, DLContext const&)+0xf5) [0x7f888a4f1785]
  [bt] (7) /usr/local/lib/python3.6/dist-packages/dgl/libdgl.so(+0xcc081b) [0x7f888a4fe81b]
  [bt] (8) /usr/local/lib/python3.6/dist-packages/dgl/libdgl.so(DGLFuncCall+0x48) [0x7f888a48d228]


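GPU indices start at 0, so '/gpu:1' refers to the second GPU. If only one GPU is visible to the process, that device does not exist and CUDA reports cudaErrorInvalidDevice. Please try g.to('/gpu:0').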
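To catch this before DGL reaches the CUDA allocation, the requested index can be checked against the number of visible GPUs. The following is only a minimal sketch: `pick_device` is a hypothetical helper, not part of DGL or TensorFlow, and it assumes the GPU count is obtained separately, for example with `len(tf.config.list_physical_devices("GPU"))`.

```python
def pick_device(requested_gpu: int, num_gpus: int) -> str:
    """Return a TensorFlow-style device string after validating the GPU index.

    Hypothetical helper for illustration; not part of DGL or TensorFlow.
    A negative index selects the CPU, mirroring the `args.gpu < 0` check
    in the question.
    """
    if requested_gpu < 0:
        return "/cpu:0"
    if requested_gpu >= num_gpus:
        # Valid GPU indices run from 0 to num_gpus - 1.
        raise ValueError(
            f"GPU index {requested_gpu} is out of range: "
            f"{num_gpus} GPU(s) visible, valid indices are 0..{num_gpus - 1}"
        )
    return f"/gpu:{requested_gpu}"


# Possible usage in the script from the question (assumes TensorFlow is
# installed and `args` and `g` are defined as in the original code):
#
#   import tensorflow as tf
#   num_gpus = len(tf.config.list_physical_devices("GPU"))
#   device = pick_device(args.gpu, num_gpus)
#   g = g.to(device)
```

With one visible GPU, `pick_device(1, 1)` raises a `ValueError` that names the valid range, instead of failing later inside `libdgl.so`.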
