Why not use a separate CUDA stream directly, as the sparsePull method does?
I think both of them are nullptr, so there's no difference here. As the comment says, it's still a TODO. Maybe it should be replaced by
Thanks, I will try to unit test it on 8gpu-v100.