I have a large training set of several hundred independent graphs, each with around 406,660 nodes. I'm trying to use dgl.batch to combine the independent graphs into one batched graph, and then use dgl.contrib.sampling.NeighborSampler to randomly sample nodes and their neighbourhoods, since the individual graphs are quite large. However, I get this error message:
```
Traceback (most recent call last):
  File "train_model.py", line 169, in <module>
    app.run(train)
  File "/home/cc91/.local/lib/python3.7/site-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/home/cc91/.local/lib/python3.7/site-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "train_model.py", line 123, in train
    expand_factor = expand_factor):
  File "/home/cc91/.local/lib/python3.7/site-packages/dgl/contrib/sampling/sampler.py", line 320, in __init__
    ThreadPrefetchingWrapper)
  File "/home/cc91/.local/lib/python3.7/site-packages/dgl/contrib/sampling/sampler.py", line 154, in __init__
    raise NotImplementedError("This loader only support read-only graphs.")
NotImplementedError: This loader only support read-only graphs.
```
However, as I understand it, batched graphs are already read-only. Does anyone have an idea of what is going on?
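To sanity-check that assumption, a minimal probe along these lines should show whether dgl.batch actually produces a read-only graph (this assumes the DGL 0.4.x API, where DGLGraph exposes an is_readonly property):

```python
import dgl

# Two tiny graphs standing in for my real training graphs.
g1 = dgl.DGLGraph([(0, 1), (1, 2)])
g2 = dgl.DGLGraph([(0, 1), (2, 1)])

batched = dgl.batch([g1, g2])
print(type(batched))        # <class 'dgl.batched_graph.BatchedDGLGraph'>
print(batched.is_readonly)  # does batching alone freeze the graph?
```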
This is the relevant part of the code:
```python
optimizer = torch.optim.Adam(model.parameters(),
                             lr=config_dict['learning_rate'])
criterion = nn.MSELoss()

print("Start training...")
model.train()
start = time.time()
for epoch in range(FLAGS.n_epochs):
    loss_list = []
    for batch, data in enumerate(train_dataloader):
        graph, labels = data  # graph is the BatchedDGLGraph from dgl.batch
        print(type(graph))
        # Sample node flows from the batched graph.
        for nf in dgl.contrib.sampling.NeighborSampler(graph, minibatch_size,
                                                       expand_factor=expand_factor):
            nf.copy_from_parent()
            nf.ndata['features'] = nf.ndata['features'].to(device)
            labels = labels.to(device)
            logits = model.forward(nf, nf.ndata['features'])
            loss = criterion(logits, labels)
```
The output of print(type(graph)) is <class 'dgl.batched_graph.BatchedDGLGraph'>, which seems to be correct.
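In case it helps narrow things down: is the expected fix simply to freeze the batched graph before sampling? Something like the sketch below, assuming DGLGraph.readonly() from the 0.4.x API also works on a BatchedDGLGraph (I'm not sure it preserves the batch structure):

```python
for batch, data in enumerate(train_dataloader):
    graph, labels = data
    # Hypothetical workaround: set the batched graph to read-only in-place
    # so the sampler's read-only check passes.
    graph.readonly()
    for nf in dgl.contrib.sampling.NeighborSampler(graph, minibatch_size,
                                                   expand_factor=expand_factor):
        nf.copy_from_parent()
        ...
```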