I have a large training set of several hundred independent graphs, each with around 406,660 nodes. I'm trying to use dgl.batch to combine the independent graphs into one batched graph, and then use dgl.contrib.sampling.NeighborSampler to randomly sample nodes and their neighbourhoods, since the individual graphs are quite large. However, I get this error message:
```
Traceback (most recent call last):
  File "train_model.py", line 169, in <module>
    app.run(train)
  File "/home/cc91/.local/lib/python3.7/site-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/home/cc91/.local/lib/python3.7/site-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "train_model.py", line 123, in train
    expand_factor = expand_factor):
  File "/home/cc91/.local/lib/python3.7/site-packages/dgl/contrib/sampling/sampler.py", line 320, in __init__
    ThreadPrefetchingWrapper)
  File "/home/cc91/.local/lib/python3.7/site-packages/dgl/contrib/sampling/sampler.py", line 154, in __init__
    raise NotImplementedError("This loader only support read-only graphs.")
NotImplementedError: This loader only support read-only graphs.
```
However, as I understand it, batched graphs are already read-only. Does anyone have an idea of what is going on?
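To sanity-check that assumption, a minimal probe along these lines should show whether dgl.batch actually produces a read-only graph (this assumes the DGL 0.4.x API, where DGLGraph exposes an is_readonly property):

```python
import dgl

# Two tiny graphs standing in for my real training graphs.
g1 = dgl.DGLGraph([(0, 1), (1, 2)])
g2 = dgl.DGLGraph([(0, 1), (2, 1)])

batched = dgl.batch([g1, g2])
print(type(batched))        # <class 'dgl.batched_graph.BatchedDGLGraph'>
print(batched.is_readonly)  # does batching alone freeze the graph?
```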
This is the relevant part of the code:
```python
optimizer = torch.optim.Adam(model.parameters(),
                             lr=config_dict['learning_rate'])
criterion = nn.MSELoss()

print("Start training...")
model.train()
start = time.time()
for epoch in range(FLAGS.n_epochs):
    loss_list = []
    for batch, data in enumerate(train_dataloader):
        graph, labels = data  # graph is the BatchedDGLGraph from dgl.batch
        print(type(graph))
        # Sample node flows from the batched graph.
        for nf in dgl.contrib.sampling.NeighborSampler(graph, minibatch_size,
                                                       expand_factor=expand_factor):
            nf.copy_from_parent()
            nf.ndata['features'] = nf.ndata['features'].to(device)
            labels = labels.to(device)
            logits = model.forward(nf, nf.ndata['features'])
            loss = criterion(logits, labels)
```
The output of print(type(graph)) is <class 'dgl.batched_graph.BatchedDGLGraph'>, which seems to be correct.
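In case it helps narrow things down: is the expected fix simply to freeze the batched graph before sampling? Something like the sketch below, assuming DGLGraph.readonly() from the 0.4.x API also works on a BatchedDGLGraph (I'm not sure it preserves the batch structure):

```python
for batch, data in enumerate(train_dataloader):
    graph, labels = data
    # Hypothetical workaround: set the batched graph to read-only in-place
    # so the sampler's read-only check passes.
    graph.readonly()
    for nf in dgl.contrib.sampling.NeighborSampler(graph, minibatch_size,
                                                   expand_factor=expand_factor):
        nf.copy_from_parent()
        ...
```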