Hi,
I’m using DGL and PyTorch on a protein dataset with more than 100K structures, each of which gets converted to a graph (so 100K+ graphs). The full dataset does not fit in memory, so ideally I’d like to save it in parts and load those parts later. I’m using DGL’s dataloader for loading.
Is there a way to progressively save parts of my dataset, or should I just produce the batches manually and load them at training time?
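To make the question concrete, here is a rough sketch of the kind of sharded save/load I have in mind. It uses only the standard library, with `pickle` and plain dicts standing in for real graphs. The helper names (`save_in_shards`, `iter_shards`, `fake_graphs`) and the shard size are made up for illustration. I assume `dgl.save_graphs` and `dgl.load_graphs` could take the place of `pickle` for actual `DGLGraph` objects, but I have not verified that this is the recommended approach.

```python
import pickle
import tempfile
from pathlib import Path


def _write_shard(buffer, out_dir, index):
    # Hypothetical naming scheme: shard_00000.pkl, shard_00001.pkl, ...
    path = out_dir / f"shard_{index:05d}.pkl"
    with open(path, "wb") as f:
        pickle.dump(buffer, f)
    return path


def save_in_shards(items, out_dir, shard_size):
    """Write an iterable to numbered shard files, buffering one shard at a time."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths, buffer = [], []
    for item in items:
        buffer.append(item)
        if len(buffer) == shard_size:
            paths.append(_write_shard(buffer, out_dir, len(paths)))
            buffer = []
    if buffer:  # final, possibly smaller shard
        paths.append(_write_shard(buffer, out_dir, len(paths)))
    return paths


def iter_shards(paths):
    """Yield items shard by shard, so only one shard is loaded at once."""
    for path in paths:
        with open(path, "rb") as f:
            yield from pickle.load(f)


def fake_graphs(n):
    # Placeholder data: in practice each item would be a graph built
    # from one protein structure.
    for i in range(n):
        yield {"id": i, "edges": [(i, i + 1)]}


with tempfile.TemporaryDirectory() as tmp:
    shard_paths = save_in_shards(fake_graphs(10), tmp, shard_size=4)
    num_shards = len(shard_paths)
    loaded = list(iter_shards(shard_paths))
```

Because `fake_graphs` is a generator, the structures are never all in memory while saving. At training time, something like `iter_shards` could presumably be wrapped in a dataset class that the dataloader reads from.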
Thanks in advance.