Creating a Dataset of Graphs on Multiple CPUs

I am trying to create a DGL dataset of graphs similar to the one in the "Make Your Own Dataset" tutorial (DGL 1.1.2post1 documentation). The only difference is that instead of CSV files I have a pandas DataFrame, though I can convert it to CSV or any other format if needed.

I have a lot of small graphs: more than 1-2 million, with about 30 nodes each on average.

Currently it takes a long time to create those graphs. I have access to a SageMaker environment where I can use instances with multiple CPUs, and I would like to know what options exist for speeding up the process. I tried the multiprocessing package with Pool but did not see any improvement in run time.
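
To make the setup concrete, here is a minimal sketch of one way the Pool approach could be arranged. It assumes the edges are available as (graph_id, src, dst) rows, for example from df[["graph_id", "src", "dst"]].itertuples(index=False, name=None); those column names follow the DGL CSV tutorial and are an assumption about my DataFrame. The helper names (group_edges, chunked, build_chunk, build_all, to_dgl) are hypothetical and not part of DGL. The idea is to group rows by graph once, then hand each worker a chunk of many graphs rather than one graph per task, since per-task pickling overhead may be one reason Pool showed no gain (I have not confirmed that this is the cause).

```python
from collections import defaultdict
from multiprocessing import Pool


def group_edges(rows):
    """Group (graph_id, src, dst) rows into {graph_id: ([src...], [dst...])}."""
    grouped = defaultdict(lambda: ([], []))
    for graph_id, src, dst in rows:
        grouped[graph_id][0].append(src)
        grouped[graph_id][1].append(dst)
    return dict(grouped)


def chunked(items, size):
    """Yield consecutive slices of `items`, each with at most `size` entries."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_chunk(chunk):
    """Worker: process many small graphs per task to amortize IPC overhead.

    Each record is (graph_id, src, dst, num_nodes). Any heavier per-graph
    work (feature extraction, graph construction) would go in this loop.
    """
    records = []
    for graph_id, src, dst in chunk:
        # Assumes node ids within a graph are 0-based integers.
        num_nodes = max(max(src), max(dst)) + 1
        records.append((graph_id, src, dst, num_nodes))
    return records


def build_all(rows, n_workers=4, chunk_size=1000):
    """Group rows once in the parent, then fan chunks out to a Pool."""
    items = [(gid, s, d) for gid, (s, d) in group_edges(rows).items()]
    with Pool(processes=n_workers) as pool:
        parts = pool.imap(build_chunk, chunked(items, chunk_size))
        # Consume the lazy iterator before the pool is torn down.
        return [record for part in parts for record in part]


def to_dgl(record):
    """Convert one record to a DGLGraph (requires dgl and torch)."""
    import dgl
    import torch

    _, src, dst, num_nodes = record
    return dgl.graph((torch.tensor(src), torch.tensor(dst)), num_nodes=num_nodes)
```

build_all(rows) should be called from under an if __name__ == "__main__": guard so that it also works on platforms that spawn worker processes. Whether to_dgl is better called inside build_chunk or in the parent process is something I would need to measure; returning plain lists from the workers keeps the data sent between processes small.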

Thanks,
Prateek Sasan
