I have built a graph with an edge weight named “score”. I want to split the graph into train and test sets for link prediction.
train_g should have all edges which have edge weights >=0.5
test_g should have all edges which have edge weights <0.5
I also need to generate negative samples based on the sizes of train_g and test_g above.
It would be helpful if you could guide me on how to achieve this. Thanks
Hi, regarding your question:
Dataset Splitting: The DGL API dgl.data.utils.split_dataset provides a way to split a dataset. However, it splits by fraction (randomly, if shuffling is enabled), not by an edge feature. For a custom split like yours, you can fetch all edges and filter them manually according to your requirements.
Negative Sampling: DGL also provides an API for negative sampling. The num_samples argument in this function allows you to specify the number of negative samples to generate for each positive example.
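Since the question asks for negatives sized to match train_g and test_g, here is a library-agnostic sketch of that idea in plain Python. It is not a DGL API: sample_negative_edges is a hypothetical helper, the edge lists are made-up toy data, and the graph is assumed to be directed with self-loops excluded from the negatives. In practice the (src, dst) pairs would come from the edges of your own graphs.

```python
import random

def sample_negative_edges(excluded_edges, num_nodes, num_samples, seed=0):
    """Draw num_samples distinct (src, dst) pairs that are not in excluded_edges.

    Hypothetical helper for illustration; uses simple rejection sampling.
    """
    rng = random.Random(seed)
    excluded = set(excluded_edges)
    # Count how many non-self-loop pairs are still free, to avoid looping forever.
    taken = sum(1 for u, v in excluded if u != v)
    available = num_nodes * (num_nodes - 1) - taken
    if num_samples > available:
        raise ValueError("not enough non-edges to sample from")
    negatives = set()
    while len(negatives) < num_samples:
        u = rng.randrange(num_nodes)
        v = rng.randrange(num_nodes)
        if u != v and (u, v) not in excluded:
            negatives.add((u, v))
    return sorted(negatives)

# Toy positive edges standing in for the edges of train_g and test_g.
train_edges = [(0, 1), (2, 3), (0, 2)]
test_edges = [(1, 2), (3, 0)]
all_edges = train_edges + test_edges
num_nodes = 4

# One negative per positive edge in each split; negatives never collide with
# any real edge, and test negatives also avoid the train negatives.
train_neg = sample_negative_edges(all_edges, num_nodes, len(train_edges), seed=0)
test_neg = sample_negative_edges(all_edges + train_neg, num_nodes, len(test_edges), seed=1)
```

Excluding edges from both splits when sampling avoids labelling a held-out test edge as a negative during training. Rejection sampling like this is fine for sparse graphs but gets slow when most node pairs are already connected.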