Hello! For context, I am working in the multi-CPU, multi-GPU setting for large graphs, in which node features are partitioned across machines and mini-batches are sent to GPUs for compute. I wanted to follow up on the second question in this post from a year ago: can node features that have not been partitioned onto a machine additionally be cached there for a period of time, or are node features always disjoint across machines? The answer a year ago was no, and I have not found anything in the code, roadmaps, or docs saying that this has changed or will change, but I would like to confirm. More generally, if the system provides caching at any level that has a notable effect on performance (besides the k-hop neighborhoods of the partitioned graph structure on each machine, which I believe is sometimes referred to as extra_cached_hops), it would be great if I could be pointed to some documentation or a quick set of high-level notes that I could reference while reading the codebase.
Thank you!