Problems with IdMap in convert_partition.py

cbonn · August 27, 2021, 11:03pm

Hello! I am having an issue with the script convert_partition.py and hoping someone can help. As the full script didn’t work for me right off the bat, I put the convert_partition code into a Jupyter notebook and broke down its individual functionalities into cells. I managed to write nodes.dgl and edges.dgl into their appropriate /part directories. However, when I get to the line

ntype, per_type_ids = id_map(orig_homo_ids)

the notebook fails silently (i.e., stops running the notebook and crashes the kernel). I tried replicating the most basic block of code I could involving IdMap, which appears as follows:

```
import numpy as np
from dgl.distributed.id_map import IdMap

if __name__ == '__main__':
    # Per-type ID ranges, copied from the "nid" section of the schema.
    nid_ranges = {'ntype1': [0, 4], 'ntype2': [5, 6]}
    # IdMap expects one (start, end) row per partition for each type.
    nid_ranges = {key: np.array(nid_ranges[key]).reshape(1, 2) for key in nid_ranges}
    id_map = IdMap(nid_ranges)
    original_homo_ids = np.array([123, 124, 125, 126, 132, 128, 129])

    id_map(original_homo_ids)
```

However, when I run this, I get the following stacktrace: 

```
dgl._ffi.base.DGLError: [15:41:51] /tmp/dgl_src/src/graph/graph_op.cc:740: Check failed: it != range_end_data + num_ranges: A bug has been occurred.  Please file a bug report at https://github.com/dmlc/dgl/issues.  Message: 
Stack trace:
  [bt] (0) 1   libdgl.dylib                        0x0000000126487fff dmlc::LogMessageFatal::~LogMessageFatal() + 111
  [bt] (1) 2   libdgl.dylib                        0x0000000126dc5d59 dgl::runtime::NDArray dgl::MapIds<long long>(dgl::runtime::NDArray, dgl::runtime::NDArray, dgl::runtime::NDArray, dgl::runtime::NDArray, int, int) + 505
  [bt] (2) 3   libdgl.dylib                        0x0000000126dc51a5 std::__1::__function::__func<dgl::$_14, std::__1::allocator<dgl::$_14>, void (dgl::runtime::DGLArgs, dgl::runtime::DGLRetValue*)>::operator()(dgl::runtime::DGLArgs&&, dgl::runtime::DGLRetValue*&&) + 2229
  [bt] (3) 4   libdgl.dylib                        0x0000000126d70f28 DGLFuncCall + 72
  [bt] (4) 5   core.cpython-38-darwin.so           0x00000001275e556e __pyx_f_3dgl_4_ffi_4_cy3_4core_FuncCall(void*, _object*, DGLValue*, int*) + 958
  [bt] (5) 6   core.cpython-38-darwin.so           0x00000001275e9754 __pyx_pw_3dgl_4_ffi_4_cy3_4core_12FunctionBase_5__call__(_object*, _object*, _object*) + 52
  [bt] (6) 7   Python3                             0x0000000100105dd6 _PyObject_MakeTpCall + 374
  [bt] (7) 8   Python3                             0x00000001001e613c call_function + 652
  [bt] (8) 9   Python3                             0x00000001001e278a _PyEval_EvalFrameDefault + 29962
```

Please note that the nid_ranges and original_homo_ids were generated from my notebook code, from the following schema and part files: 

Schema:
```
{
    "nid": {
        "ntype1": [0, 4],
        "ntype2": [5, 6]
    },
    "eid": {
        "etype1": [0, 1],
        "etype2": [2, 3]
    }
}
```

p000-dgl_test_graph_nodes:
```
0 0 0 1 123 1
1 0 0 1 124 1
2 0 0 1 125 2
3 0 0 1 126 2
4 1 0 1 127 2
```

p000-dgl_test_graph_edges:
```
0 1 123 124 1 0 .1
1 0 124 123 2 0 .2
3 5 126 128 3 0 .3
6 3 129 126 4 1 .4
```

Please let me know if I am doing something wrong here, or if there is genuinely a bug in the code as the stacktrace suggests. Thank you in advance!!

zhengda1936 · August 31, 2021, 1:21am

It seems your test graph has 7 nodes in total, but you pass homogeneous node IDs of 123-129, which are out of range. If your graph has 7 nodes, the homogeneous node IDs must be between 0 and 6. Any ID outside that range will result in the error you encountered.
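A quick way to catch this before calling IdMap is to check the IDs against the node count. The following is only a minimal sketch using NumPy; `find_out_of_range_ids` is a hypothetical helper written for illustration, not part of DGL, and it assumes that valid homogeneous IDs are exactly 0 through `num_nodes - 1`.

```python
import numpy as np

def find_out_of_range_ids(homo_ids, num_nodes):
    """Return the IDs that fall outside the valid range [0, num_nodes).

    Hypothetical helper for illustration; not a DGL function.
    """
    homo_ids = np.asarray(homo_ids)
    mask = (homo_ids < 0) | (homo_ids >= num_nodes)
    return homo_ids[mask]

# The IDs from the post above, checked against a 7-node graph.
ids = np.array([123, 124, 125, 126, 132, 128, 129])
bad = find_out_of_range_ids(ids, num_nodes=7)
if bad.size > 0:
    print(f"{bad.size} of {ids.size} IDs are out of range: {bad}")
```

Here every ID is reported as out of range, which matches the failure above. Raising an error in Python at this point is easier to debug than the check failing inside the C++ library.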

cbonn · August 31, 2021, 2:09am

Hi Zhengda! Thanks so much for your response. So to give a bit more context, the original_homo_ids was the output of this code from convert_partition.py:

# The original IDs are homogeneous IDs.
# Similarly, we need to add the original homogeneous node IDs
orig_ids = np.concatenate([orig_src_id, orig_dst_id, orig_homo_nid])
orig_homo_ids = orig_ids[idx]

It seems likely that my schema for edges is incorrect. It is essentially:

["src_id", "dst_id", "orig_src_id", "orig_dst_id", "orig_type_edge_id", "edge_type", "attribute_1"],

and I had been interpreting src_id and dst_id to be the homogeneous IDs of the source and destination nodes, and orig_src_id and orig_dst_id as the original input IDs of the nodes (i.e., 123-129). Should orig_src_id and orig_dst_id be the homogeneous IDs of the src and dst nodes, then? In that case, what are src_id and dst_id in the schema?

cbonn · September 1, 2021, 5:11pm

Hi again! I managed to get my graph to write, but using what I believe to be an incorrect schema. The schema should be:

"src_id", "dst_id", "orig_src_id", "orig_dst_id", "orig_type_edge_id", "edge_type", "timestamp"

My interpretation based on your previous response was that src_id and dst_id should be the homogeneous IDs for the given partition, and that orig_src_id and orig_dst_id should be what I have been calling the “DGL IDs”, which are the reassigned IDs from 0 to n-1 (with n being the number of nodes). These DGL IDs have been reassigned from the original input IDs (i.e., Neo4j IDs) of the nodes. However, the DGL graph will only write/load if I assign BOTH src_id and orig_src_id as the homogeneous IDs for this partition (and both dst_id and orig_dst_id as the homogeneous IDs as well). Do you have any insight as to why this might be?
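For reference, the reassignment from original input IDs to contiguous IDs described above can be sketched with NumPy as follows. This is only an illustration under stated assumptions: `remap_to_contiguous` is a hypothetical helper, not something from convert_partition.py. It only sees nodes that appear in the edge list, and it does not group IDs by node type, so it may not produce the exact layout DGL expects.

```python
import numpy as np

def remap_to_contiguous(orig_src, orig_dst):
    """Map arbitrary original node IDs to contiguous IDs 0..n-1.

    Hypothetical helper for illustration. Returns the sorted unique
    original IDs plus the remapped source and destination arrays.
    """
    orig_src = np.asarray(orig_src)
    orig_dst = np.asarray(orig_dst)
    all_ids = np.concatenate([orig_src, orig_dst])
    # uniq[i] is the original ID that receives the new ID i.
    uniq, inverse = np.unique(all_ids, return_inverse=True)
    new_src = inverse[:len(orig_src)]
    new_dst = inverse[len(orig_src):]
    return uniq, new_src, new_dst

# Original source/destination IDs taken from the edge file in the first post.
uniq, new_src, new_dst = remap_to_contiguous(
    [123, 124, 126, 129], [124, 123, 128, 126]
)
print(uniq)     # original IDs, indexed by their new ID
print(new_src)  # remapped source IDs
print(new_dst)  # remapped destination IDs
```

Keeping `uniq` around makes it possible to translate any new ID back to the original input ID with `uniq[new_id]`.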
