How to Get Different Splits for Cross-Validation


Could you please tell me how to conduct cross-validation with different splits?

I noticed in the GCN paper that the authors conducted a 5-fold cross-validation experiment with 5 different train/test splits.

However, the train/validation/test split provided by DGL seems to be fixed. Do we need to split the datasets manually to run cross-validation, or can this be done directly in DGL?

If we need to do this manually, could you please tell me how the datasets should be split? I did not find any details in the paper, e.g., how many train/test points there are in total, how many train/test points there are per category, etc.

Thank you very much!

Currently we use binary masks to represent which nodes belong to the training, validation, and test sets. See L30 of the DGL GCN example.
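
Since the splits are just binary masks, one way to get different splits is to build the masks yourself. Below is a minimal sketch using only NumPy, not a DGL API: the helper name `kfold_masks` and its parameters are hypothetical, and the uniformly random, unstratified folds are an assumption, not the protocol from the GCN paper.

```python
import numpy as np

def kfold_masks(num_nodes, num_folds=5, seed=0):
    """Yield one (train_mask, test_mask) pair of boolean arrays per fold.

    Hypothetical helper: nodes are shuffled uniformly at random, so the
    folds are NOT stratified by class.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_nodes)          # random node order
    folds = np.array_split(perm, num_folds)    # near-equal sized folds
    for test_idx in folds:
        test_mask = np.zeros(num_nodes, dtype=bool)
        test_mask[test_idx] = True             # this fold is held out
        train_mask = ~test_mask                # all other nodes are used for training
        yield train_mask, test_mask

# Toy usage with 10 nodes; in practice num_nodes would be the graph's node count.
for fold, (train_mask, test_mask) in enumerate(kfold_masks(10, num_folds=5)):
    print(fold, int(train_mask.sum()), int(test_mask.sum()))
```

Each pair of arrays could then be converted to tensors and used in place of the dataset's fixed masks when computing the loss and the evaluation metric. If a validation set is also needed, part of each training mask would have to be held out in the same way.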

For details about how cross-validation was performed, I recommend emailing Thomas Kipf (the first author of the GCN paper) directly.