Node's ID error

Hey there, I am new to the DGL library and I couldn’t find a solution for my issue.

I have awkward ID values in my CSV dataset, such as the ones in this image:
[image: sample of the ID values in the CSV]

I am getting the following error: “dgl._ffi.base.DGLError: The num_nodes argument must be larger than the max ID in the data, but got 7 and 2714967881”.

Do the node IDs have to be ordered for it to work, or could it be something else? Are there any other requirements for loading data from a CSV?

I get this error when running “6_load_data.py” from the Blitz tutorial.

You are right that the node IDs have to be consecutive integers from 0 to (number of nodes minus 1).

This can be done with something like:

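import pandas as pd
# assuming that you were using Pandas to load `cells.csv`
cells = pd.read_csv('cells.csv')
ids = cells['Id'].values
# map each raw ID to its row position, giving IDs 0..N-1 (assumes the IDs are unique)
ids_map = {k: i for i, k in enumerate(ids)}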
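
The mapping on its own only renames the nodes. The edge endpoints have to be translated with the same dictionary before the graph is built. Here is a minimal sketch in plain Python, so it runs without Pandas or DGL. The sample IDs and the helper names `build_id_map` and `remap_edges` are made up for illustration, and the final `dgl.graph` call is shown only as a comment.

```python
def build_id_map(raw_ids):
    """Map each distinct raw ID to a consecutive integer, in first-seen order."""
    id_map = {}
    for raw in raw_ids:
        if raw not in id_map:
            id_map[raw] = len(id_map)
    return id_map


def remap_edges(src_raw, dst_raw, id_map):
    """Translate raw edge endpoints to consecutive IDs (KeyError on unknown IDs)."""
    src = [id_map[s] for s in src_raw]
    dst = [id_map[d] for d in dst_raw]
    return src, dst


# Hypothetical sample data standing in for the node and edge CSV columns.
node_ids = [2714967881, 15, 998877, 42]
edge_src = [2714967881, 15, 42]
edge_dst = [15, 998877, 2714967881]

id_map = build_id_map(node_ids)
src, dst = remap_edges(edge_src, edge_dst, id_map)
num_nodes = len(id_map)

# Every remapped ID is now strictly smaller than num_nodes, which is the
# condition the DGLError above complains about.
assert max(src + dst) < num_nodes

# With DGL installed, the graph could then be built along the lines of:
#   g = dgl.graph((src, dst), num_nodes=num_nodes)
```

Keep `id_map` around if you need to translate predictions or node features back to the original IDs later.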

FYI, CSV datasets will be supported in the next DGL release, and ID mapping will be handled automatically and internally, though you will have to make several modifications to your raw CSV files. Please give it a try when the new DGL release is ready.


Can’t wait for this release. Until then, I will provide the ordered IDs myself for the CSVs.
