I am trying to create a DGL dataset of graphs similar to the one in this example (Make Your Own Dataset — DGL 1.1.2post1 documentation). The only difference is that instead of CSV files I have a pandas DataFrame, but it can be converted to CSV or any other format.
I have a lot of small graphs: more than 1-2 million, with an average of 30 nodes each.
Currently it takes a long time to create those graphs. I have access to a SageMaker environment where I can use instances with multiple CPUs, and I would like to know what options there are to speed up the process. I tried the multiprocessing package with Pool, but it did not improve the run time.
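
To make the question concrete, here is a minimal sketch of the kind of pattern that might apply. It assumes the DataFrame can be flattened to (graph_id, src, dst) rows; the edges are grouped by graph once, and each worker receives a large chunk of graphs so the per-task overhead is spread over many graphs. The names group_edges, chunked, and build_chunk are hypothetical, and the dgl.graph call appears only as a comment so the sketch runs without DGL installed.

```python
from collections import defaultdict
from multiprocessing import Pool


def group_edges(rows):
    """Group (graph_id, src, dst) rows into {graph_id: (src_list, dst_list)} in one pass."""
    grouped = defaultdict(lambda: ([], []))
    for gid, src, dst in rows:
        grouped[gid][0].append(src)
        grouped[gid][1].append(dst)
    return dict(grouped)  # plain dict, so it can be pickled and sent to workers


def chunked(items, size):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_chunk(chunk):
    """Process many graphs per task; returns (graph_id, num_nodes, num_edges) tuples."""
    out = []
    for gid, (src, dst) in chunk:
        num_nodes = max(max(src), max(dst)) + 1
        # Assumption: with DGL installed, the graph could be built here, e.g.
        #   g = dgl.graph((src, dst), num_nodes=num_nodes)
        out.append((gid, num_nodes, len(src)))
    return out


if __name__ == "__main__":
    # Toy data standing in for the real DataFrame rows (hypothetical values).
    rows = [(0, 0, 1), (0, 1, 2), (1, 0, 1)]
    items = list(group_edges(rows).items())
    with Pool(processes=2) as pool:
        parts = pool.imap(build_chunk, chunked(items, 1000))
        results = [r for part in parts for r in part]
    print(results)
```

With a DataFrame, the rows could come from something like df[["graph_id", "src", "dst"]].itertuples(index=False, name=None), where the column names are assumptions. Whether this helps depends on where the time actually goes: if each task handles a single tiny graph, the cost of sending data between processes can cancel out any gain from using more CPUs.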
Thanks,
Prateek Sasan