Hi there!
I am using EdgeDataLoader to generate mini-batches from a training graph (a DGLHeteroGraph); the resulting loader is what I call train_loader below. When the outputs of train_loader are accessed in this line:
for input_nodes, positive_graph, blocks in train_loader:
...
I get the following error:
File "/path", line 337, in train
for input_nodes, positive_graph, blocks in train_loader:
File "/./lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
data = self._next_data()
File "/./lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 385, in _next_data
data = self._dataset_fetcher.fetch(index) # may raise StopIteration
File "./lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
return self.collate_fn(data)
File "/./lib/python3.7/site-packages/dgl/dataloading/dataloader.py", line 662, in collate
return self._collate(items)
File "./lib/python3.7/site-packages/dgl/dataloading/dataloader.py", line 563, in _collate
items = {k: F.zerocopy_from_numpy(np.asarray(v)) for k, v in items.items()}
File "/./lib/python3.7/site-packages/dgl/dataloading/dataloader.py", line 563, in <dictcomp>
items = {k: F.zerocopy_from_numpy(np.asarray(v)) for k, v in items.items()}
File "/./lib/python3.7/site-packages/dgl/backend/pytorch/tensor.py", line 307, in zerocopy_from_numpy
return th.as_tensor(np_array)
TypeError: can't convert np.ndarray of type numpy.object_. The only supported types are: float64, float32, float16, int64, int32, int16, int8, uint8, and bool.
Here is the code snippet I am using:
train_eid_dict = {
    canonical_etype: torch.arange(training_graph.num_edges(canonical_etype[1]), dtype=torch.int64).to(device)
    for canonical_etype in training_graph.canonical_etypes
}

train_loader = dgl.dataloading.EdgeDataLoader(
    g=heterograph.g,
    eids=train_eid_dict,
    block_sampler=sampler,
    batch_size=batch_size,
    g_sampling=training_graph,
)

for input_nodes, positive_graph, blocks in train_loader:
    blocks = [b.to(torch.device('cuda')) for b in blocks]
    positive_graph = positive_graph.to(torch.device('cuda'))
    negative_graph = negative_graph.to(torch.device('cuda'))
    input_features = {
        ntype: blocks[0].srcdata[ntype]
        for ntype in blocks.ntypes
    }
    edge_labels = edge_subgraph.edata['labels']
    edge_predictions = model(edge_subgraph, blocks, input_features)
    loss = compute_loss(edge_labels, edge_predictions)
    opt.zero_grad()
    loss.backward()
    opt.step()
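For context, the sampler passed as block_sampler above is set up roughly like this (a simplified sketch; I am showing dgl.dataloading.MultiLayerFullNeighborSampler here, and the exact sampler choice should not matter for this error):

import dgl
import torch

# Sketch of the sampler/device setup, not the exact configuration:
# full-neighbor sampling over two layers
sampler = dgl.dataloading.MultiLayerFullNeighborSampler(2)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')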
Could you help me understand why this happens? I already specified the values of train_eid_dict to be torch tensors with dtype=torch.int64 here:

torch.arange(training_graph.num_edges(canonical_etype[1]), dtype=torch.int64).to(device)

so I am unsure why I am getting the numpy.object_ error.
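For reference, a quick way to double-check the contents of train_eid_dict (a minimal snippet reusing the names from the code above) would be:

# Sanity check: every key should be a canonical etype tuple and
# every value a torch.Tensor with dtype torch.int64
for canonical_etype, eids in train_eid_dict.items():
    print(canonical_etype, type(eids), eids.dtype)

(Just to confirm that the values really are int64 tensors before they go into EdgeDataLoader.)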