Hi @thegadfly, thanks for providing this to us! I ran some tests on the same graphs with DGL and PYG, and the results show that PYG loads about 4 times faster than DGL:

DGL

| Graph number | Size on disk (MB) | Load time (seconds) |
| --- | --- | --- |
| 10W | 105 | 42 |
| 100W | 1047 | 422 |
| 500W | 5235 | 2067 |

PYG

| Graph number | Size on disk (MB) | Load time (seconds) |
| --- | --- | --- |
| 10W | 105 | 9.72 |
| 100W | 1057 | 95.53 |
| 500W | 5246 | 473.26 |

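To make the "about 4 times" figure easy to check, here is a small sketch that recomputes the load-time ratio from the numbers in the two tables above. The dictionary and variable names are only for illustration.

```python
# Load times in seconds, copied from the DGL and PYG tables above.
dgl_load = {"10W": 42, "100W": 422, "500W": 2067}
pyg_load = {"10W": 9.72, "100W": 95.53, "500W": 473.26}

# Ratio of DGL load time to PYG load time for each graph count.
speedup = {n: dgl_load[n] / pyg_load[n] for n in dgl_load}

for n, s in speedup.items():
    print(f"{n}: PYG loads {s:.2f}x faster than DGL")
```

The ratio stays between roughly 4.3 and 4.4 at all three sizes, so the gap does not seem to depend on the number of graphs.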
However, in one scenario they behave as you reported: when the source graphs contain many repeated graphs, PYG achieves a much higher compression ratio, so the file takes very little disk space and loads much faster. For example:

PYG with a vector of repeated graphs

| Graph number | Size on disk (MB) | Load time (seconds) |
| --- | --- | --- |
| 10W | 0.2 | 0.009 |
| 100W | 2 | 0.084 |
| 500W | 10 | 0.343 |

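One possible explanation, which I have not confirmed for PYG itself: if the graphs are saved through Python's `pickle` (the default serializer behind `torch.save`), an object that appears many times in a list is written once and then stored as short back-references. The sketch below shows that effect with plain Python objects; `make_graph` and its dict layout are hypothetical stand-ins for a real graph object.

```python
import pickle


def make_graph():
    # Hypothetical stand-in for one graph: an edge list plus node features.
    return {"edges": [(i, i + 1) for i in range(1000)], "x": [0.5] * 1000}


n = 200
same = [make_graph()] * n                    # one object referenced n times
distinct = [make_graph() for _ in range(n)]  # n equal but separate objects

size_same = len(pickle.dumps(same))
size_distinct = len(pickle.dumps(distinct))
print(f"repeated object:  {size_same} bytes")
print(f"distinct objects: {size_distinct} bytes")

# After loading, the repeated entries are again one shared object.
loaded = pickle.loads(pickle.dumps(same))
shared = all(g is loaded[0] for g in loaded)
```

Note that this only applies when the list holds the very same object many times. Graphs that are equal in content but built separately are serialized in full, as in the `distinct` case.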
In any case, we will investigate further to find out where the gap comes from. Could you also share more information about your dataset, so we can check whether it contains many repeated graphs?
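If it helps, a rough way to estimate the share of repeated graphs is to hash each graph and count the distinct hashes. This is only a sketch: it assumes each graph can be exported as an `(edges, features)` pair of plain Python lists, and `graph_fingerprint` and `duplicate_ratio` are hypothetical helper names, not DGL or PYG functions. It finds exact duplicates only, not isomorphic graphs with different node numbering.

```python
import hashlib
from collections import Counter


def graph_fingerprint(edges, features):
    """Hash one graph given as an edge list and a feature list."""
    h = hashlib.sha256()
    # Sort the edges so that edge order does not change the fingerprint.
    h.update(repr(sorted(edges)).encode())
    h.update(repr(list(features)).encode())
    return h.hexdigest()


def duplicate_ratio(graphs):
    """Fraction of graphs that repeat an earlier graph (0.0 means all unique)."""
    if not graphs:
        return 0.0
    counts = Counter(graph_fingerprint(e, f) for e, f in graphs)
    return 1 - len(counts) / len(graphs)


# Toy data: three copies of one graph plus one different graph.
g1 = ([(0, 1), (1, 2)], [1.0, 2.0, 3.0])
g2 = ([(0, 1)], [1.0, 2.0])
ratio = duplicate_ratio([g1, g1, g1, g2])
print(f"duplicate ratio: {ratio:.2f}")
```

A ratio close to 1 would mean the dataset is mostly repeated graphs, which matches the scenario in the third table.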