The number of nodes after partitioning is not the same as the number of nodes during training.
Which API implements this rebalancing, and is it possible not to use this feature?
The following is the output of the partitioning step:
Splitting the ogb-product dataset into two partitions
part 0 has 1488953 nodes and 1198163 are inside the partition
part 0 has 62170974 edges and 60484815 are inside the partition
part 1 has 1514106 nodes and 1250866 are inside the partition
part 1 has 64919624 edges and 63233465 are inside the partition
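For reference, a two-way split with output like the above is typically produced by `dgl.distributed.partition_graph`. The sketch below is an assumption about the partitioning script being used here (dataset loading, mask names, graph name, and output path are illustrative, not taken from this post):

```python
# Minimal sketch of how a 2-part split of ogbn-products might be produced.
# Paths, the graph name, and the mask setup are assumptions for illustration.
import dgl
import torch
from ogb.nodeproppred import DglNodePropPredDataset

data = DglNodePropPredDataset(name="ogbn-products")
graph, labels = data[0]
split_idx = data.get_idx_split()

# Build boolean train/val/test masks as node data.
for name, idx in zip(["train_mask", "val_mask", "test_mask"],
                     [split_idx["train"], split_idx["valid"], split_idx["test"]]):
    mask = torch.zeros(graph.num_nodes(), dtype=torch.bool)
    mask[idx] = True
    graph.ndata[name] = mask

# balance_ntypes asks the partitioner to balance the number of training nodes
# per partition (an assumption about the script used here, not confirmed by the post).
dgl.distributed.partition_graph(
    graph,
    graph_name="ogb-product",
    num_parts=2,
    out_path="data",
    balance_ntypes=graph.ndata["train_mask"],
    balance_edges=True,
)
```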
Then we start training and observe:
ubuntu, part 1, train: 98307 (local: 98307), val: 19661 (local: 19621), test: 1106545 (local: 1106545)
ubuntu2, part 0, train: 98308 (local: 96028), val: 19662 (local: 19662), test: 1106546 (local: 1082433)
After rebalancing, the numbers of training, validation, and test nodes on the two workers are very close, differing by only one node.
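For what it's worth, the per-worker counts in the training log look like they come from `dgl.distributed.node_split`, and the near-even split across workers is presumably its `force_even=True` behavior. A rough sketch of that part of a distributed training script (file names, variable names, and the print format are assumptions, not the exact code being run):

```python
# Rough sketch of how the "train/val/test (local: ...)" line is typically produced
# in a DGL distributed training script; file names and variables are assumptions.
import numpy as np
import dgl

dgl.distributed.initialize("ip_config.txt")
g = dgl.distributed.DistGraph("ogb-product", part_config="data/ogb-product.json")
pb = g.get_partition_book()

# node_split with force_even=True gives every trainer a nearly equal share of the
# masked nodes, regardless of which partition physically stores them.
train_nid = dgl.distributed.node_split(g.ndata["train_mask"], pb, force_even=True)
val_nid = dgl.distributed.node_split(g.ndata["val_mask"], pb, force_even=True)
test_nid = dgl.distributed.node_split(g.ndata["test_mask"], pb, force_even=True)

# "local" counts how many of the assigned IDs live in this worker's own partition.
local_nid = pb.partid2nids(pb.partid).detach().numpy()
print("part {}, train: {} (local: {}), val: {} (local: {}), test: {} (local: {})".format(
    g.rank(),
    len(train_nid), len(np.intersect1d(train_nid.numpy(), local_nid)),
    len(val_nid), len(np.intersect1d(val_nid.numpy(), local_nid)),
    len(test_nid), len(np.intersect1d(test_nid.numpy(), local_nid))))
```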
part 1:
local train + val + test: 98307 + 19621 + 1106545 = 1224473
assigned train + val + test: 98307 + 19661 + 1106545 = 1224513
1224473 != 1250866 (nodes inside partition 1)
1224513 != 1514106 (total nodes in partition 1)
part 0:
local train + val + test: 96028 + 19662 + 1082433 = 1198123
assigned train + val + test: 98308 + 19662 + 1106546 = 1224516
1198123 != 1198163 (nodes inside partition 0)
1224516 != 1488953 (total nodes in partition 0)
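The sums above can be double-checked with a few lines of Python; the numbers are copied verbatim from the logs:

```python
# Sanity check of the sums above; all numbers are taken verbatim from the logs.
part1_local = 98307 + 19621 + 1106545      # local train + val + test on part 1
part1_assigned = 98307 + 19661 + 1106545   # train + val + test assigned to worker 1
assert part1_local == 1224473 and part1_local != 1250866        # != nodes inside part 1
assert part1_assigned == 1224513 and part1_assigned != 1514106  # != total nodes in part 1

part0_local = 96028 + 19662 + 1082433      # local train + val + test on part 0
part0_assigned = 98308 + 19662 + 1106546   # train + val + test assigned to worker 0
assert part0_local == 1198123 and part0_local != 1198163        # != nodes inside part 0
assert part0_assigned == 1224516 and part0_assigned != 1488953  # != total nodes in part 0
```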