Hello! I'm developing software that measures inference performance across different frameworks, and I have a question: are there any parallelism tools aimed specifically at inference rather than training?
For example, in PyTorch it is possible to run image inference on the GPU from multiple threads.
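To make concrete the kind of thread-level parallelism I mean, here is a minimal sketch using only the Python standard library. `run_inference` is a hypothetical stand-in for a real forward pass (for instance, a PyTorch model called in eval mode with gradients disabled); the function names, the worker count, and the batch contents are all illustrative assumptions, not part of any framework's API.

```python
from concurrent.futures import ThreadPoolExecutor


def run_inference(batch):
    # Hypothetical stand-in for a real model forward pass.
    # Here it just doubles every value in the batch.
    return [2 * x for x in batch]


def parallel_inference(batches, num_workers=4):
    # Dispatch each batch to a worker thread.
    # pool.map returns results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return list(pool.map(run_inference, batches))


batches = [[1, 2], [3, 4], [5, 6]]
results = parallel_inference(batches)
```

Whether threads actually give a speedup depends on how much of the forward pass releases the Python GIL, which is part of what I want to measure.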
Are there comparable parallelism tools in DGL? I'm only interested in inference, not training.