CUDA Kernel for GPU/UVA Sampling

pranjaln · October 1, 2024, 11:29am

Hi all. I would like to get a detailed understanding of the memory transfers (D2D, H2D, and D2H) that take place during sampling by the DGL dataloader. To do so, I need help identifying the CUDA kernel(s) responsible for GPU/UVA-based sampling. Could someone point me to the kernels I need to look at, so that I can profile the time spent on memory operations?

minjie · October 10, 2024, 4:45am

The whole pipeline consists of multiple kernels, so it's not easy to pinpoint exactly which one could be the bottleneck. My suggestion is to use a profiling tool like NVVP (NVIDIA Visual Profiler) or its successor, Nsight Systems, to get a holistic view of where the time goes.
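If a GUI profiler is not convenient, one possible alternative is to export a trace with `torch.profiler` and sum the GPU event durations per category. The snippet below is a minimal sketch, not DGL's own tooling. It assumes copy events are named along the lines of "Memcpy HtoD (Pinned -> Device)" and that kernel events carry a category containing "kernel"; both may differ between PyTorch versions, so check them against your own trace. `run_sampling` and the sample events are hypothetical placeholders.

```python
import json
from collections import defaultdict

# Assumption: copy events carry "HtoD"/"DtoH"/"DtoD" in their names, e.g.
# "Memcpy HtoD (Pinned -> Device)". Adjust the markers to match your trace.
COPY_MARKERS = {"HtoD": "H2D", "DtoH": "D2H", "DtoD": "D2D"}


def classify(event):
    """Return 'H2D', 'D2H', 'D2D', 'kernel', or None for one trace event."""
    name = event.get("name", "")
    for marker, label in COPY_MARKERS.items():
        if marker in name:
            return label
    if "kernel" in str(event.get("cat", "")).lower():
        return "kernel"
    return None


def summarize_trace(events):
    """Sum the 'dur' field (microseconds) of GPU events per category."""
    totals = defaultdict(float)
    for event in events:
        label = classify(event)
        if label is not None:
            totals[label] += event.get("dur", 0)
    return dict(totals)


def capture_trace(run_sampling, path="sampling_trace.json"):
    """Profile a user-supplied callable and summarize the exported trace.

    run_sampling is hypothetical: any function that iterates the DGL
    dataloader for a few batches. Requires PyTorch built with CUDA.
    """
    import torch
    from torch.profiler import ProfilerActivity, profile

    with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA]) as prof:
        run_sampling()
        torch.cuda.synchronize()  # make sure pending GPU work is recorded
    prof.export_chrome_trace(path)
    with open(path) as f:
        return summarize_trace(json.load(f)["traceEvents"])


# Hypothetical events, shaped like entries in a Chrome-format trace.
sample_events = [
    {"name": "Memcpy HtoD (Pinned -> Device)", "cat": "gpu_memcpy", "dur": 120},
    {"name": "Memcpy DtoD (Device -> Device)", "cat": "gpu_memcpy", "dur": 30},
    {"name": "hypothetical_sampling_kernel", "cat": "kernel", "dur": 250},
    {"name": "Memcpy DtoH (Device -> Pageable)", "cat": "gpu_memcpy", "dur": 45},
    {"name": "cudaMemcpyAsync", "cat": "cuda_runtime", "dur": 10},
]
totals = summarize_trace(sample_events)
```

One caveat: with UVA sampling the kernels read pinned host memory directly, so that traffic would most likely show up as kernel time rather than as separate memcpy events.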
