EdgeDataLoader: TypeError: can’t convert np.ndarray of type numpy.object_

Hi there!

I am using EdgeDataLoader to generate mini-batches from a training graph (a DGLHeteroGraph); I define the resulting loader as train_loader. When I access the outputs of train_loader in this line:

for input_nodes, positive_graph, blocks in train_loader:
    ...

I am getting an error as follows:

File "/path", line 337, in train
    for input_nodes, positive_graph, blocks in train_loader:
  File "/./lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/./lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 385, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "./lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/./lib/python3.7/site-packages/dgl/dataloading/dataloader.py", line 662, in collate
    return self._collate(items)
  File "./lib/python3.7/site-packages/dgl/dataloading/dataloader.py", line 563, in _collate
    items = {k: F.zerocopy_from_numpy(np.asarray(v)) for k, v in items.items()}
  File "/./lib/python3.7/site-packages/dgl/dataloading/dataloader.py", line 563, in <dictcomp>
    items = {k: F.zerocopy_from_numpy(np.asarray(v)) for k, v in items.items()}
  File "/./lib/python3.7/site-packages/dgl/backend/pytorch/tensor.py", line 307, in zerocopy_from_numpy
    return th.as_tensor(np_array)
TypeError: can't convert np.ndarray of type numpy.object_. The only supported types are: float64, float32, float16, int64, int32, int16, int8, uint8, and bool.
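For reference, this TypeError comes from torch.as_tensor being handed a NumPy array of dtype object, which is NumPy's fallback when the elements are not uniform scalars. A minimal standalone illustration, independent of DGL:

import numpy as np
import torch

# Uniform numeric data converts cleanly:
torch.as_tensor(np.asarray([1, 2, 3]))  # tensor([1, 2, 3])

# Ragged data collapses to dtype=object, which PyTorch cannot convert:
ragged = np.empty(2, dtype=object)
ragged[0], ragged[1] = np.arange(2), np.arange(3)
torch.as_tensor(ragged)  # TypeError: can't convert np.ndarray of type numpy.object_.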

The code I am using is the following:

train_eid_dict = {
    canonical_etype: torch.arange(training_graph.num_edges(canonical_etype), dtype=torch.int64).to(device)
    for canonical_etype in training_graph.canonical_etypes
}
train_loader = dgl.dataloading.EdgeDataLoader(
    g=heterograph.g,
    eids=train_eid_dict,
    block_sampler=sampler,
    batch_size=batch_size,
    g_sampling=training_graph,
)
for input_nodes, positive_graph, blocks in train_loader:
    # move the mini-batch to GPU
    blocks = [b.to(torch.device('cuda')) for b in blocks]
    positive_graph = positive_graph.to(torch.device('cuda'))

    # gather the input features of the source nodes of the first block
    input_features = {
        ntype: blocks[0].srcdata[ntype]
        for ntype in blocks[0].ntypes
    }
    edge_labels = positive_graph.edata['labels']
    edge_predictions = model(positive_graph, blocks, input_features)
    loss = compute_loss(edge_labels, edge_predictions)
    opt.zero_grad()
    loss.backward()
    opt.step()

Could you help me understand why this happens? I already specified the values of train_eid_dict to be torch tensors with dtype torch.int64 here, so I am unsure why I get the numpy.object_ error.

torch.arange(training_graph.num_edges(canonical_etype[1]), dtype=torch.int64).to(device)

Can you provide a complete code snippet for reproducing the issue? If you are not comfortable with sharing your graph data, can you share a subgraph that yields the same issue?

@mufeili I have saved the training graph (DGLHeteroGraph) in a pickle file and uploaded it to GitHub. You can find it under this link here. Then the train_eid_dict and the train_loader are defined as above.

  1. I assume this is for training_graph?
  2. What is heterograph.g?
  3. How did you initialize sampler?
  4. What is the batch_size you are using?
  5. Are you aware that DGL provides save_graphs/load_graphs for saving/loading DGLGraph objects? It should be more efficient than pickle.
  1. Yes, exactly, this train_loader is for the training_graph.
  2. Actually, this is the entire graph, but in this case you can just pass the training graph from the file here and skip the g_sampling argument.
  3. The sampler is initialized with
fanout = 4
n_layers = 3
sampler = dgl.dataloading.MultiLayerNeighborSampler([fanout] * n_layers)
  4. batch_size = 512
  5. I used pickle since this is a heterograph and, per this issue, that is the recommended approach for now.

I tried

for input_nodes, positive_graph, blocks in train_loader:
    pass

but I am not able to reproduce your issue.
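Putting the pieces from this thread together, my repro attempt looked roughly like this (the pickle path is a placeholder for your uploaded file):

import pickle

import dgl
import torch

# Rough end-to-end version of the snippets above; the pickle path is a
# placeholder for the uploaded file.
with open('training_graph.pkl', 'rb') as f:
    training_graph = pickle.load(f)

train_eid_dict = {
    cet: torch.arange(training_graph.num_edges(cet), dtype=torch.int64)
    for cet in training_graph.canonical_etypes
}
sampler = dgl.dataloading.MultiLayerNeighborSampler([4] * 3)
train_loader = dgl.dataloading.EdgeDataLoader(
    g=training_graph,     # sample from the training graph directly,
    eids=train_eid_dict,  # so g_sampling is not needed
    block_sampler=sampler,
    batch_size=512,
)
for input_nodes, positive_graph, blocks in train_loader:
    pass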

As of DGL 0.5, there is no difference between homogeneous and heterogeneous graphs and you can use save_graphs/load_graphs for heterogeneous graphs as well.
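For example, a minimal sketch (the file name is arbitrary):

import dgl

# Save and reload a (hetero)graph; the file name is arbitrary.
dgl.save_graphs('training_graph.bin', [training_graph])
graphs, _ = dgl.load_graphs('training_graph.bin')
training_graph = graphs[0]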

@mufeili I was able to figure out the error. It had to do with the generation of the training graph: I passed in a list of nodes (with multiple occurrences of some nodes) instead of a set of unique nodes. Problem solved!
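In case someone else runs into this, a minimal sketch of the fix with made-up node names: deduplicate the raw node list before assigning integer IDs.

# Hypothetical raw data in which the same node appears twice:
raw_users = ['alice', 'bob', 'alice', 'carol']

# Deduplicate first; otherwise the same node receives several IDs and
# the constructed graph ends up inconsistent:
user_to_id = {u: i for i, u in enumerate(dict.fromkeys(raw_users))}
# {'alice': 0, 'bob': 1, 'carol': 2}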
