How to obtain block and node features when iterating a DataLoader in mini-batch training

LouisDong · September 21, 2023, 4:30am

Hello! I use a GCN to process my image dataset (CIFAR-10): each image becomes a node, and I build the graph with kNN. I read the mini-batch documentation, implemented mini-batch training, and obtained good results. However, I ran into a problem. When building the graph data, I load all node features in process(), which is very time-consuming and resource-intensive. Instead, I would like to fetch the node features of each subgraph while iterating over the DataLoader. This is part of my code:

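import dgl
import numpy as np
import torch
from dgl.data import DGLDataset

class GraphDataset(DGLDataset):
    # get_graph(), get_datasets() and self.indices are defined elsewhere (not shown).
    def __init__(self):
        super().__init__(name="image2graph")

    def process(self):
        self.get_graph(self.indices)

        # All images and labels are loaded up front and attached as node data.
        images_list, augmented_list, targets_list = self.get_datasets()
        self.graph.ndata['image'] = torch.from_numpy(np.array(images_list))
        self.graph.ndata['label'] = torch.from_numpy(np.array(targets_list))

    def __getitem__(self, i):
        return self.graph

    def __len__(self):
        return self.graph.num_nodes()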

train_graph_dataset = GraphDataset()
sampler = dgl.dataloading.MultiLayerFullNeighborSampler(1)
train_graph_dataloader = dgl.dataloading.DataLoader(
    train_graph_dataset.graph,
    torch.arange(len(train_graph_dataset)),
    sampler,
    num_workers=p['num_workers'],
    batch_size=p['batch_size'],
    drop_last=True,
    shuffle=True)
# training
for i, (input_nodes, output_nodes, blocks) in enumerate(train_graph_dataloader):
    # Features of the block's source nodes, sliced from graph.ndata
    images = blocks[0].srcdata['image'].cuda(non_blocking=True)
    ...

I would appreciate any advice. Thank you very much.

frozenbugs · September 26, 2023, 12:57pm

Do you mean leaving the node features on disk and fetching the relevant features for each subgraph during iteration?

If so, DGL does not currently support this. The good news is that the ongoing GraphBolt work aims to provide it; see the example here. The work is almost done, and we are wrapping up and writing the documentation. If you want to try it, feel free to install our nightly build and try the alpha version.
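
To make the idea of "leave the features on disk and fetch them per subgraph" concrete, here is a minimal sketch that uses only NumPy memory mapping. It is not a DGL or GraphBolt API: the `DiskFeatureStore` class, the file name, and the toy feature shape are all hypothetical, and it assumes the features have been saved once as a single `.npy` array whose row i belongs to node i.

```python
import os
import tempfile

import numpy as np


class DiskFeatureStore:
    """Hypothetical helper: node features stay in a .npy file and are read lazily."""

    def __init__(self, path):
        # mmap_mode="r" maps the file read-only instead of loading it into RAM.
        self.features = np.load(path, mmap_mode="r")

    def fetch(self, node_ids):
        # Index the mapped file by node ID; only the requested rows are read.
        # np.array(...) copies them into an ordinary in-memory array.
        node_ids = np.asarray(node_ids, dtype=np.int64)
        return np.array(self.features[node_ids])


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "node_features.npy")

        # Toy stand-in for the image features: 10 nodes, each a 3x4x4 "image"
        # filled with its own node ID so the result is easy to check.
        all_features = np.ones((10, 3, 4, 4), dtype=np.float32)
        all_features *= np.arange(10, dtype=np.float32).reshape(10, 1, 1, 1)
        np.save(path, all_features)

        store = DiskFeatureStore(path)
        # Stand-in for the input_nodes of one mini-batch.
        batch = store.fetch([7, 2, 5])
        del store  # release the memory map before the directory is removed
        return batch


batch = demo()
print(batch.shape)
print(batch[:, 0, 0, 0])
```

In a training loop like the one in the question, the node IDs would presumably come from `input_nodes` (for example `store.fetch(input_nodes.cpu().numpy())`, then `torch.from_numpy(...)` and a move to the GPU), on the assumption that `input_nodes` lists the source nodes of `blocks[0]` in the same order. Random row access on a memory-mapped file can be slow on spinning disks, so this is a workaround sketch rather than a tuned solution.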
