Link prediction on a heterogeneous graph: splitting data for training and testing

Hi all,
I want to split the edges of my heterogeneous graph into training and test sets, but I don’t know how. I have read the tutorial (Link Prediction using Graph Neural Networks — DGL 0.6.1 documentation) and the thread (Edges to include in training of link prediction for heterogeneous graphs), but I still can’t work out how to do it. My graph has 9 edge types.
Thanks in advance!

One way is to set the edge masks when you prepare the graph and then filter on them before training, as below (here a mask value of 1 marks a training edge and 2 marks a validation edge):

train_eid_dict = {etype: (graph.edges[etype].data['mask'] == 1).nonzero(as_tuple=True)[0] for etype in graph.etypes}
val_eid_dict   = {etype: (graph.edges[etype].data['mask'] == 2).nonzero(as_tuple=True)[0] for etype in graph.etypes}

You can then pass these dicts to the edge dataloaders.


Thank you for the help

This is my first project in machine learning, so I don’t really know where to begin. I watched the video about link prediction with DGL and read the docs on heterogeneous graphs on the site. These references were great, but to go further I don’t know how to generate the masks.
Do I generate the edge masks for all edges?
My other question is about my dataset. It contains a score that determines whether two nodes are linked or not. Should that score be added to the graph as an edge feature or as a separate validation edge type? I don’t know how to add this data.

It depends on the problem you wish to solve. One typical way is to randomly select a portion (say 90%) of the edges for training and the rest for testing. You then generate the 0-1 mask based on the selection.
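
To make the random selection concrete, here is a minimal sketch using plain PyTorch. The helper name make_split_masks, the edge type names and the edge counts are all made up for illustration. It uses 1 for training and 2 for the held-out edges rather than a literal 0-1 mask, only so that it lines up with the filtering snippet earlier in the thread.

```python
import torch

def make_split_masks(num_edges_per_etype, train_frac=0.9, seed=0):
    """Hypothetical helper: returns {etype: int64 tensor} where
    1 marks a training edge and 2 marks a held-out edge."""
    gen = torch.Generator().manual_seed(seed)
    masks = {}
    for etype, num_edges in num_edges_per_etype.items():
        # Shuffle the edge IDs of this edge type independently.
        perm = torch.randperm(num_edges, generator=gen)
        n_train = int(train_frac * num_edges)
        # Start with every edge held out, then mark the random subset as training.
        mask = torch.full((num_edges,), 2, dtype=torch.int64)
        mask[perm[:n_train]] = 1
        masks[etype] = mask
    return masks

# Made-up edge types and counts, standing in for a real graph.
num_edges_per_etype = {"follows": 1000, "likes": 500}
masks = make_split_masks(num_edges_per_etype)

# Same filtering pattern as in the earlier reply, applied to the mask dict.
train_eid_dict = {et: (m == 1).nonzero(as_tuple=True)[0] for et, m in masks.items()}
val_eid_dict = {et: (m == 2).nonzero(as_tuple=True)[0] for et, m in masks.items()}
```

With a DGL heterograph, the counts would presumably come from graph.num_edges(etype) for each etype in graph.etypes, and each mask could be stored with graph.edges[etype].data['mask'] = masks[etype] so that the earlier filtering code works unchanged. Splitting each edge type separately keeps roughly the same train/test ratio for all 9 edge types.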
