Hi,
I have built a graph with an edge weight named “score”. I want to split the graph into train and test sets for link prediction:
train_g should have all edges which have edge weights >=0.5
test_g should have all edges which have edge weights <0.5
I also need to generate negative samples matching the sizes of train_g and test_g above.
It would be helpful if you could guide me on how to achieve this. Thanks
Hi, for your question:
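- Dataset splitting: The DGL API `dgl.data.utils.split_dataset` provides a way to split a dataset, but it only splits by fractions (optionally at random), not by a condition such as an edge-weight threshold. For a custom split like yours, you can manually fetch all edges and filter them according to your requirements: read the weights from `g.edata['score']`, build a boolean mask for each condition, and extract the matching edges, for example with `dgl.edge_subgraph`.
- Negative sampling: DGL also provides an API for negative sampling, `dgl.sampling.global_uniform_negative_sampling`. The `num_samples` argument of this function specifies the total number of negative samples to generate, so you can set it to the number of edges in train_g or test_g. Note that the function may return fewer samples than requested.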