I'm wondering what these masks are for.
Do they represent true facts and false facts (triplets and negative triplets)?
Or are they created randomly, with code like the line below?
g.edata['train_mask'] = torch.zeros(1000, dtype=torch.bool).bernoulli(0.6)
I picked a few triplets from FB15K-237 as examples. Here is how I create a heterogeneous graph from them. Is this a proper way to do it?
import dgl
import torch

# Each key is (source node type, relation, destination node type);
# each value is a (source IDs, destination IDs) pair of tensors.
data_dict = {
('entity', '/travel/travel_destination/climate./travel/travel_destination_monthly_climate/month', 'entity'): (torch.tensor([0]), torch.tensor([1])),
('entity', '/music/performance_role/regular_performances./music/group_membership/group', 'entity'): (torch.tensor([2]), torch.tensor([3])),
('entity', '/location/location/contains', 'entity'): (torch.tensor([4, 4]), torch.tensor([5, 6]))
}
g = dgl.heterograph(data_dict)
With this heterograph, how do I create masks and split the data into train/valid/test sets like the built-in dataset does?
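To make the question concrete, here is a minimal sketch of what I imagine the masks to be, assuming each mask is just a boolean tensor with one entry per edge that flags which split the edge belongs to (so not true vs. false facts, and not random). The triplet IDs below are made up for illustration, and whether the built-in dataset builds its masks exactly this way is my assumption, not something I have confirmed.

```python
import torch

# Hypothetical (head, relation, tail) triplets as integer IDs. In practice
# these would be read from the raw train/valid/test files of FB15K-237.
train = [(0, 0, 1), (2, 1, 3), (4, 2, 5)]
valid = [(4, 2, 6)]
test = [(1, 0, 2)]

# Stack all splits into one edge list so that every edge has a fixed index.
triplets = torch.tensor(train + valid + test)
src, rel, dst = triplets[:, 0], triplets[:, 1], triplets[:, 2]
num_edges = triplets.shape[0]


def make_mask(start, stop, size):
    """Return a boolean mask of length `size` that is True on [start, stop)."""
    mask = torch.zeros(size, dtype=torch.bool)
    mask[start:stop] = True
    return mask


n_train, n_valid = len(train), len(valid)
train_mask = make_mask(0, n_train, num_edges)
val_mask = make_mask(n_train, n_train + n_valid, num_edges)
test_mask = make_mask(n_train + n_valid, num_edges, num_edges)

# With DGL installed, the tensors could then be attached to a single graph
# whose relation IDs live in an edge feature (an assumed layout):
#   g = dgl.graph((src, dst))
#   g.edata['etype'] = rel
#   g.edata['train_mask'] = train_mask
#   g.edata['val_mask'] = val_mask
#   g.edata['test_mask'] = test_mask
```

If this is right, the split would be fixed by the dataset files, and every edge would be True in exactly one of the three masks.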
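I'm not sure whether it is okay to ask in Chinese; my English is quite poor...
I don't understand what the masks are for. In other tutorials I saw that they can be used to split the data into training, validation, and test sets, but the tensor there is generated randomly: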
g.edata['train_mask'] = torch.zeros(1000, dtype=torch.bool).bernoulli(0.6)
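Taking the raw FB15K237 data as an example,
how should I create the masks so that the dataset looks like the built-in FB15k237Dataset and can be passed directly to link.py in the RGCN example?
I'm new to programming, so my question may be very basic. Please bear with me, and thanks in advance for any replies!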