I am training an MLP with minibatching on an unbalanced dataset, and I want to oversample the minority class and downsample the majority class.
For the dataloader I have defined a graph sampler, and I was hoping to pass a `WeightedRandomSampler` as a keyword argument that gets forwarded to the parent PyTorch `DataLoader`.
From the error I am seeing, I understand this isn't possible because the DGL dataloader produces an iterable-style dataset. Are there other ways to oversample the minority class with DGL?
```
ValueError: DataLoader with IterableDataset: expected unspecified sampler option, but got sampler=<torch.utils.data.sampler.WeightedRandomSampler object at 0x7fe4fe719d90>
```
```python
# graph sampler
graph_sampler = NeighborSampler([15, 10, 5], prefetch_node_feats=['h'])
graph_sampler = as_edge_prediction_sampler(graph_sampler)

# sampler for the parent PyTorch dataloader that takes as input
# each sample's probability (sampler_weight)
sampler = torch.utils.data.sampler.WeightedRandomSampler(
    sampler_weight.type('torch.DoubleTensor'),
    len(sampler_weight),
    replacement=True)

use_uva = (args.mode == 'mixed')

# DGL dataloader with graph_sampler and sampler
dataloader = DataLoader(
    g,
    train_eids,
    graph_sampler,
    device=device,
    batch_size=8,
    shuffle=True,
    drop_last=False,
    num_workers=0,
    use_uva=use_uva,
    sampler=sampler)
```
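For context, here is roughly how I build the `sampler_weight` tensor used above: per-sample weights inversely proportional to class frequency, so minority-class samples are drawn more often. This is a minimal standalone sketch (the `labels` tensor is a made-up example, not my real data):

```python
import torch
from torch.utils.data import WeightedRandomSampler

# Hypothetical binary labels for 8 samples (1 = minority class).
labels = torch.tensor([0, 0, 0, 0, 0, 0, 1, 1])

# Weight each sample inversely to its class frequency.
class_counts = torch.bincount(labels)        # counts per class
class_weights = 1.0 / class_counts.double()  # rarer class -> larger weight
sampler_weight = class_weights[labels]       # one weight per sample

# Sampling with replacement oversamples the minority class.
sampler = WeightedRandomSampler(
    sampler_weight, num_samples=len(sampler_weight), replacement=True)

drawn = torch.tensor(list(sampler))  # indices drawn for one epoch
```

This works fine with a plain map-style PyTorch dataset; the question is how to achieve the same effect through the DGL dataloader.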