Why does neighbor sampling on ogbn-products take much longer than on ogbn-papers100M?

Although this problem may not be easy to solve, I still hope someone can help me (or at least give me a hint). The question is: why is the sampling process on ogbn-products about 5 times slower than on ogbn-papers100M? I only change the dataset name, and here is my code: GNN_acceleration/profile_manual_pin_CPUGPU.py at main · yichuan520030910320/GNN_acceleration · GitHub. In my results, sampling a mini-batch on ogbn-products takes around 100 ms, while on ogbn-papers100M it takes about 23 ms, even though the number of sampled nodes is nearly the same for the two datasets.


I used the TensorBoard profiler and found that the main difference between them is the function call subgidx = _CAPI_DGLSampleNeighbors(g._graph, nodes_all_types, fanout_array, edge_dir, prob_arrays, excluded_edges_all_t, replace) in dgl/dataloading/neighbor_sampler.py. What confuses me is that ogbn-papers100M is much bigger than ogbn-products, yet this call is slower on the smaller graph.

You can run python profile_manual_pin_CPUGPU.py --dataset ogbn-products or python profile_manual_pin_CPUGPU.py --dataset ogbn-papers100M to reproduce the result.

Thanks in advance if someone can help me!

I can’t give a direct answer, but I hope these suggestions help:

  1. Extract a minimal piece of code that only does the sampling and profile that; a rough sketch is below.
  2. Run it under py-spy with the -n/--native flag to get a flamegraph that includes the C++ call stack for analysis.
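For point 1, something along these lines should work. This is only a sketch: FANOUTS and BATCH_SIZE are placeholders that you should set to match what your profile_manual_pin_CPUGPU.py actually uses. It calls dgl.sampling.sample_neighbors directly (the Python wrapper that ends up in _CAPI_DGLSampleNeighbors), so you can compare the two datasets without any DataLoader overhead:

```python
# Minimal sampling-only benchmark (sketch). Assumes DGL and OGB are installed;
# FANOUTS and BATCH_SIZE are assumptions -- set them to match your script.
import argparse
import time

import torch
import dgl
from ogb.nodeproppred import DglNodePropPredDataset

FANOUTS = [15, 10, 5]   # per-layer fanouts (assumption)
BATCH_SIZE = 1024       # number of seed nodes per mini-batch (assumption)

parser = argparse.ArgumentParser()
parser.add_argument("--dataset", default="ogbn-products")
args = parser.parse_args()

# Load the graph only; labels and features are not needed for this test.
g, _ = DglNodePropPredDataset(name=args.dataset)[0]
seeds = torch.randint(0, g.num_nodes(), (BATCH_SIZE,))


def sample_one_batch():
    # Mimic multi-layer neighbor sampling: expand the frontier layer by layer.
    frontier = seeds
    for fanout in FANOUTS:
        sg = dgl.sampling.sample_neighbors(g, frontier, fanout)
        src, _ = sg.edges()
        frontier = torch.unique(torch.cat([src, frontier]))


# Warm-up, then time the raw sampling call that the profiler flagged.
for _ in range(3):
    sample_one_batch()

runs = 10
start = time.perf_counter()
for _ in range(runs):
    sample_one_batch()
elapsed = (time.perf_counter() - start) / runs
print(f"{args.dataset}: avg sampling time per batch = {elapsed * 1000:.1f} ms")
```

For point 2, assuming the sketch above is saved as minimal_sampling.py, you can run it under py-spy record with the native flag, e.g. py-spy record -n -o sampling.svg -- python minimal_sampling.py --dataset ogbn-products, and compare the time spent inside the DGL C++ sampling code for the two datasets.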
