I am training an MLP with minibatching on an unbalanced dataset, and I want to oversample the minority class and downsample the majority class.
For the dataloader I have defined a graph sampler, and I was hoping to pass a `WeightedRandomSampler` as a keyword argument that gets forwarded to the parent PyTorch `DataLoader`.
From the error I am seeing, I understand this isn't possible because the DGL dataloader produces an iterable-style dataset. Are there other ways to oversample the minority class with DGL?
```
ValueError: DataLoader with IterableDataset: expected unspecified sampler option, but got sampler=<torch.utils.data.sampler.WeightedRandomSampler object at 0x7fe4fe719d90>
```
```python
# graph sampler
graph_sampler = NeighborSampler([15, 10, 5], prefetch_node_feats=['h'])
graph_sampler = as_edge_prediction_sampler(graph_sampler)

# sampler for the parent PyTorch dataloader that takes as input
# each sample's probability (sampler_weight)
sampler = torch.utils.data.sampler.WeightedRandomSampler(
    sampler_weight.type('torch.DoubleTensor'),
    len(sampler_weight),
    replacement=True)

use_uva = (args.mode == 'mixed')

# DGL dataloader with graph_sampler and sampler
dataloader = DataLoader(
    g,
    train_eids,
    graph_sampler,
    device=device,
    batch_size=8,
    shuffle=True,
    drop_last=False,
    num_workers=0,
    use_uva=use_uva,
    sampler=sampler)
```
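For context, here is roughly how I build the `sampler_weight` tensor used above: per-sample weights inversely proportional to class frequency, so minority-class samples are drawn more often. This is a minimal standalone sketch (the `labels` tensor is a made-up example, not my real data):

```python
import torch
from torch.utils.data import WeightedRandomSampler

# Hypothetical binary labels for 8 samples (1 = minority class).
labels = torch.tensor([0, 0, 0, 0, 0, 0, 1, 1])

# Weight each sample inversely to its class frequency.
class_counts = torch.bincount(labels)        # counts per class
class_weights = 1.0 / class_counts.double()  # rarer class -> larger weight
sampler_weight = class_weights[labels]       # one weight per sample

# Sampling with replacement oversamples the minority class.
sampler = WeightedRandomSampler(
    sampler_weight, num_samples=len(sampler_weight), replacement=True)

drawn = torch.tensor(list(sampler))  # indices drawn for one epoch
```

This works fine with a plain map-style PyTorch dataset; the question is how to achieve the same effect through the DGL dataloader.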