Pin Memory Handling in DataLoader

Hi,

I’ve come across a few points while working on the dataloading pipeline, and I would really appreciate some guidance:

  • A standard PyTorch DataLoader requires a custom batch class to define a method named pin_memory before pin_memory=True has any effect on it (see the sketch after this list).
  • dgl.DGLGraph has an in-place method called pin_memory_ but no pin_memory.
  • The dgl.dataloading.dataloader.GraphDataLoader class doesn’t seem to handle this case specially.
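
For reference, the duck-typing contract that torch.utils.data.DataLoader uses with pin_memory=True looks roughly like the sketch below (the GraphBatch class is hypothetical, just to illustrate the expected method):

    import torch

    class GraphBatch:
        # Hypothetical batch wrapper; any object returned by a collate_fn
        # can opt in to pinning by defining a pin_memory() method.
        def __init__(self, features: torch.Tensor):
            self.features = features

        def pin_memory(self):
            # DataLoader(pin_memory=True) calls this on every batch and
            # uses its return value, so the pinned object must be returned.
            self.features = self.features.pin_memory()
            return self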

I’m currently working with small graphs, and my approach batches the graphs in a custom collate_fn of torch.utils.data.DataLoader. Should I call batched_graph.pin_memory_() before returning it, or should I instead return the features separately (e.g. return batched_graph, node_feature, edge_feature, graph_feature)? What would be the best practice you recommend?

Please call pin_memory_() before returning it, as shown below:

    import dgl
    import torch

    def collate_fn(samples):
        # Each sample is a (graph, label) pair; unzip into two lists.
        graphs, labels = map(list, zip(*samples))
        # pin_memory_ pins the batched graph in place and returns it;
        # Tensor.pin_memory returns a new pinned copy of the labels.
        return dgl.batch(graphs).pin_memory_(), torch.tensor(labels).pin_memory()

    data_loader = dgl.dataloading.GraphDataLoader(
        minigc_dataset, batch_size=batch_size, shuffle=True, collate_fn=collate_fn
    )
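
Pinning mainly pays off when the batch is then copied to the GPU with non-blocking transfers. A minimal usage sketch, assuming a CUDA device is available (only the label tensor uses non_blocking=True here, since whether DGLGraph.to forwards that flag, or even permits moving a pinned graph, may depend on the DGL version):

    device = torch.device("cuda")

    for batched_graph, labels in data_loader:
        # Pinned (page-locked) host memory enables asynchronous copies;
        # non_blocking=True lets the label transfer overlap with other work.
        batched_graph = batched_graph.to(device)
        labels = labels.to(device, non_blocking=True)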