I have a training dataset of shape N*F, where N denotes the number of training samples and F denotes the number of fields. There is no structural information in the dataset, so I want to construct a complete graph (with F nodes) for each training sample. The node embeddings can be obtained from a torch.nn.Embedding. This case is common: for example, if the F fields denote F modalities, then this is the same case as discussed in Self attention and feature fusion over graphs - #3. In addition, the constructed graph is complete and its structure is the same for all training samples (X_i, i=1,…,N). I also found a topic discussing this: Batch same structured graph.
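For context, the field-to-node-embedding step can be sketched like this (a minimal sketch; the vocabulary size, embedding dimension, and integer field values are illustrative assumptions, not part of my actual setup):

```
import torch
import torch.nn as nn

F = 24            # number of fields = number of nodes per graph
vocab_size = 100  # assumed vocabulary size shared by all fields

# One embedding table; each field's categorical value is looked up
# to obtain the corresponding node embedding.
embedding = nn.Embedding(vocab_size, embedding_dim=16)

x = torch.randint(0, vocab_size, (F,))  # one training sample: F field indices
node_features = embedding(x)            # shape (F, 16), one row per node
print(node_features.shape)              # torch.Size([24, 16])
```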

Basically, I have two ways to load the dataset into DGL. The first way is to construct all graphs in advance and then use a DataLoader to generate batched graphs. The second way is to construct the graphs while loading batches.

(I wrote the code in the PyG manner, but I think it is the same in DGL.)

More specifically, the following is the first method:

```
import torch
import torch_geometric
from itertools import product
from torch_geometric.data import Data

class GraphDataset(torch_geometric.data.Dataset):
    def __init__(self, trainingData):
        super().__init__()
        self.trainingData = trainingData
        self.size = self.trainingData.shape[0]            # N
        self.num_fields = self.trainingData.shape[1] - 1  # F (last column is the label)
        # Complete-graph edge index, shared by all samples
        self.src_nodes, self.dst_nodes = zip(*product(range(self.num_fields), repeat=2))
        edge_index = torch.tensor([self.src_nodes, self.dst_nodes], dtype=torch.long)
        # Data is from PyG; it can be replaced with a DGL graph object in the same way.
        # However, when N is large, this will cause OOM.
        self.graphs = [Data(x=self.trainingData[idx, :-1], edge_index=edge_index)
                       for idx in range(self.size)]

    def len(self):
        return self.size

    def get(self, index):
        graph = self.graphs[index]
        y = self.trainingData[index, -1]
        return graph, y
```

The second method:

```
import torch
import torch_geometric
from itertools import product
from torch_geometric.data import Data

class GraphDataset(torch_geometric.data.Dataset):
    def __init__(self, trainingData):
        super().__init__()
        self.trainingData = trainingData
        self.size = self.trainingData.shape[0]            # N
        self.num_fields = self.trainingData.shape[1] - 1  # F (last column is the label)

    def len(self):
        return self.size

    def get(self, index):
        X = self.trainingData[index, :-1]
        # Complete-graph edge index, rebuilt for every single sample
        src_nodes, dst_nodes = zip(*product(range(self.num_fields), repeat=2))
        graph = Data(x=torch.as_tensor(X),
                     edge_index=torch.tensor([src_nodes, dst_nodes], dtype=torch.long))
        # We only process the batch data here, but it is very time-consuming.
        y = self.trainingData[index, -1]
        return graph, y
```

When N is large, the first method runs out of memory, and the second method is very slow. By slow, I mean that with a batch size of 2048 and F = 24, generating the **graph objects** (i.e., PyG `Data` or DGL graphs) adds about 20 minutes of overhead. (If I just use the non-graph data, one epoch takes 35 min. If I generate **graph objects** and use *DataLoader workers* and *pin_memory*, it takes 55 min. If I do not use *DataLoader workers*, it takes 2 hours! In fact, I am very confused that generating 2048 **graph objects** costs so much time.)

Are there any memory/computation-friendly ways or tricks for handling batches of complete graphs? In addition, I think allocating each training sample a complete graph and storing the (training sample, graph) pairs on disk is not applicable, since we would still need to load the whole dataset into memory, which makes it the same as the first method.
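One trick consistent with the timings above: since the graph structure is identical for every sample, the complete-graph edge index can be built exactly once and the same tensor reused for every sample, so only the node features change per `get`/`__getitem__` call. A framework-agnostic sketch of the shared structure (`num_nodes = 24` matches the F = 24 setting above):

```
import torch
from itertools import product

num_nodes = 24  # F

# Built once, shared by all samples (complete graph, self-loops included).
src, dst = zip(*product(range(num_nodes), repeat=2))
edge_index = torch.tensor([src, dst], dtype=torch.long)

print(edge_index.shape)  # torch.Size([2, 576]), i.e. 24 * 24 directed edges
```

The per-sample cost then reduces to attaching node features to this fixed structure, instead of recomputing `product(...)` and allocating a new edge-index tensor 2048 times per batch.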

**Edit**: I just replaced PyG with DGL, utilizing **dgl.dataloading.GraphDataLoader**:

```
import dgl
import torch
from itertools import product

class GraphDataset(dgl.data.DGLDataset):
    def __init__(self, trainingData):
        self.trainingData = trainingData
        self.size = self.trainingData.shape[0]            # N
        self.num_fields = self.trainingData.shape[1] - 1  # F (last column is the label)
        # Complete-graph edge index, built once and shared by all samples
        src, dst = zip(*product(range(self.num_fields), repeat=2))
        self.src_nodes = torch.tensor(src, dtype=torch.long)
        self.dst_nodes = torch.tensor(dst, dtype=torch.long)
        super().__init__(name='complete_graph_dataset')

    def process(self):
        pass

    def __len__(self):
        return self.size

    def __getitem__(self, index):
        X = self.trainingData[index, 0:-1]
        graph = dgl.graph((self.src_nodes, self.dst_nodes))
        graph.ndata['x'] = torch.as_tensor(X)
        y = torch.as_tensor(self.trainingData[index, -1])
        return graph, y
```

It is now as fast as not using **graph objects** at all! That surprised me a lot; I used to think it would be the same for PyG and DGL.