Link prediction on a heterogeneous graph: splitting data for training and testing

asat · April 21, 2021, 2:40pm

Hi all
I want to split the edges of my heterogeneous graph into training and test sets, but I don't know how. I have read "Link Prediction using Graph Neural Networks" in the DGL 0.6.1 documentation and the forum thread "Edges to include in training of link prediction for heterogeneous graphs", but I still can't work out how to do it. My graph has 9 edge types.
Thanks in advance!

Sriharsha · April 22, 2021, 2:17am

One way is to set the edge masks at the time of graph preparation and filter them like below before training:

# Assumes each edge type stores an integer 'mask' edge feature: 1 = training edge, 2 = validation edge.
train_eid_dict = {etype: (graph.edges[etype].data['mask'] == 1).nonzero(as_tuple=True)[0] for etype in graph.etypes}
val_eid_dict   = {etype: (graph.edges[etype].data['mask'] == 2).nonzero(as_tuple=True)[0] for etype in graph.etypes}

You can then pass these dicts to the edge dataloaders.

asat · April 22, 2021, 11:07am

Thank you for the help

asat · April 22, 2021, 6:21pm

This is my first machine learning project, so I don't really know where to begin. I watched the video about link prediction with DGL and read the docs on heterogeneous graphs on the site. Those references were great, but I still don't know how to generate the masks.
Do I generate the edge masks for all edges?
Another question is about my dataset. It contains a score that determines whether two nodes have a link or not. Should that score be added to the graph as an edge feature, or as a validation edge type? I don't know how to add this data.

minjie · April 27, 2021, 7:26am

It depends on the problem you wish to solve. One typical way is to randomly select a portion (say 90%) of the edges for training and the rest for testing. You then generate the 0-1 mask based on the selection.
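
As a rough illustration of the random split described above, here is a minimal sketch in plain Python. The helper name make_split_masks, the edge type names and the edge counts are all hypothetical. It labels edges 1 (train) and 2 (held out) so that the result lines up with the filtering snippet earlier in the thread; a 0-1 mask would work the same way with different labels.

```python
import random

def make_split_masks(num_edges_per_type, train_frac=0.9, seed=0):
    """Return {etype: list of labels}, one label per edge.

    Label 1 marks a training edge and 2 a held-out edge
    (an assumed convention, chosen to match the == 1 / == 2 filters).
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    masks = {}
    for etype, num_edges in num_edges_per_type.items():
        num_train = int(train_frac * num_edges)
        labels = [1] * num_train + [2] * (num_edges - num_train)
        rng.shuffle(labels)  # randomise which edges land in each split
        masks[etype] = labels
    return masks

# Hypothetical edge counts for two edge types.
masks = make_split_masks({"follows": 10, "likes": 20})
```

Each list could then be converted to a tensor and stored as graph.edges[etype].data['mask'], the feature the filtering snippet reads. Splitting each edge type separately keeps roughly the same train fraction for every type, which may matter when some of the 9 edge types are much rarer than others.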
