Could you give more examples of a self-designed MultiLayerDropoutSampler?

You give a very simple example of the dropout method with probability p.


But in some cases the sampler is more complicated. For example, I saw the paper NeuralSparse (ICML 2020) and want to implement it with DGL, but I am new to it. The key ideas of the paper are roughly the following:

  1. Calculate a score for each edge with an MLP.
  2. For each node, apply a softmax over all of its incoming edges to turn the scores into a probability distribution.
  3. For each node, sample k neighbors with the Gumbel-Softmax trick (see the sketch after this list).
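
Here is my rough attempt at expressing these three steps with DGL, just to make the question concrete (this is my own sketch, not the paper's code; score_mlp, the node feature name 'x', and the temperature tau are placeholders I made up):

import torch
import dgl
from dgl.nn.functional import edge_softmax

def neuralsparse_edge_probs(g, score_mlp, tau=0.5):
    # Step 1: score every edge with an MLP over its endpoint features.
    src, dst = g.edges()
    h = torch.cat([g.ndata['x'][src], g.ndata['x'][dst]], dim=1)
    z = score_mlp(h)                                  # shape (E, 1)
    # Step 2: softmax of the scores over each node's incoming edges.
    pi = edge_softmax(g, z)
    # Step 3: add Gumbel noise and re-normalize per node; this is the soft
    # relaxation, while the paper repeats it k times to pick k neighbors.
    gumbel = -torch.log(-torch.log(torch.rand_like(pi)))
    return edge_softmax(g, (torch.log(pi + 1e-10) + gumbel) / tau)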

The overall process is as described in the three steps above.

How can I implement this efficiently? I see several difficulties:

  1. Can I pass extra parameters to the sample_frontier function? Which other parts of the code would I need to modify?
  2. How do I perform the softmax operations efficiently? The simple example above only masks edges with probability p, but I want to sample each node's edges according to a probability distribution over its neighbors. Traversing the nodes one by one seems very time-consuming (see the sketch after this list).
  3. How can the network inside the sampler be trained with the loss function?
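
Regarding question 2, this is the per-node loop I am trying to avoid, versus what I hope a grouped operation such as dgl.nn.functional.edge_softmax can do in a single call (sketch only; I assume the edge scores are stored as an (E, 1) edge feature 's'):

import torch
from dgl.nn.functional import edge_softmax

# Naive version: loop over nodes and softmax each node's in-edge scores.
def per_node_softmax_loop(g):
    out = torch.empty_like(g.edata['s'])
    for v in g.nodes():
        eids = g.in_edges(v, form='eid')
        out[eids] = torch.softmax(g.edata['s'][eids], dim=0)
    return out

# Vectorized version: one call groups the edges by destination node.
def per_node_softmax_vectorized(g):
    return edge_softmax(g, g.edata['s'])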

Thank you for your continued help and support; I look forward to your reply!

I doubt that NeuralSparse is suitable for large-scale graphs, since it computes a score for every edge in the original graph before taking the Gumbel-Softmax. The paper also states that the sampling complexity is O(m), where m is the total number of edges in the graph. The point of BlockSampler, though, is to avoid computation on all edges.


The key idea of the paper is to sample blocks with a neural model. I also think the algorithm is time-consuming and memory-unfriendly: the MLP parameters are updated every batch, so the probabilities of all edges have to be re-computed with the newly updated MLP.
Still, in some cases learning to choose the important edges in a small graph also matters, e.g. xERTE.

I have implemented a simple 1-layer sampler with a neural network, but I do not know how to normalize the weights (it seems that sample_neighbors does not require the weights to be normalized?). Last but not least, I do not know how to update the parameters in the sampler. Please help me or give me some suggestions.

import dgl
import torch
import torch.nn as nn

class MultiLayerDropoutSampler(dgl.dataloading.BlockSampler):
    def __init__(self, num_ents, num_rels, h_dim, fanout, num_layers, max_time):
        super().__init__(num_layers)
        self.max_time = max_time
        self.h_dim = h_dim
        # Learnable entity and relation embeddings used to score edges.
        self.ent_embs = nn.Embedding(num_ents, self.h_dim)
        self.rel_embs = nn.Parameter(torch.empty(num_rels, self.h_dim))
        nn.init.xavier_normal_(self.rel_embs)

        # Linear scoring weight applied to the [src || rel || dst] embeddings.
        self.weight = nn.Parameter(torch.empty(3 * self.h_dim, 1))
        nn.init.xavier_normal_(self.weight)
        self.fanout = fanout

    def sample_frontier(self, block_id, g, seed_nodes, *args, **kwargs):
        # Get all in-edges of the seed nodes.
        subg = dgl.in_subgraph(g, seed_nodes, store_ids=True)

        src, dst = subg.edges()
        rel = subg.edata['type']

        # Node IDs encode (entity, timestamp); recover the entity part.
        src_ent = src // (self.max_time + 1)
        dst_ent = dst // (self.max_time + 1)

        src_embs = self.ent_embs(src_ent)
        dst_embs = self.ent_embs(dst_ent)
        rel_embs = self.rel_embs[rel, :]

        # Score each edge and use the score as an (unnormalized) sampling weight.
        h = torch.cat([src_embs, rel_embs, dst_embs], dim=1)
        weight = torch.sigmoid(torch.mm(h, self.weight))
        subg.edata['prob'] = weight.squeeze(-1)  # flatten to a 1-D per-edge weight
        frontier = dgl.sampling.sample_neighbors(subg, seed_nodes, self.fanout, prob='prob')
        return frontier

    def __len__(self):
        return self.num_layers
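
For context, this is roughly how I would like to train it, although I am not sure gradients can actually flow back into the sampler (everything here is a placeholder: model, compute_loss, labels, train_nids, g and the hyper-parameters; I assume the dgl.dataloading.NodeDataLoader API):

import itertools
import torch
import dgl

sampler = MultiLayerDropoutSampler(num_ents, num_rels, h_dim=200,
                                   fanout=10, num_layers=2, max_time=max_time)
dataloader = dgl.dataloading.NodeDataLoader(
    g, train_nids, sampler, batch_size=1024, shuffle=True)

# Hope: optimize the GNN and the sampler's scoring parameters together.
params = itertools.chain(model.parameters(),
                         sampler.ent_embs.parameters(),
                         [sampler.rel_embs, sampler.weight])
optimizer = torch.optim.Adam(params, lr=1e-3)

for input_nodes, output_nodes, blocks in dataloader:
    batch_pred = model(blocks)
    loss = compute_loss(batch_pred, labels[output_nodes])
    optimizer.zero_grad()
    loss.backward()   # but do any gradients ever reach the sampler?
    optimizer.step()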

After some investigation I figured out that to truly learn a trainable sampler, the sample_neighbors operation itself would have to support gradients, which it currently does not. Supporting that would likely require writing a segmented Gumbel-Softmax kernel. Overall, that's a reasonable feature request and we will try to plan it in a future release.
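
To illustrate what such a kernel would compute, here is a plain-PyTorch sketch of a segmented Gumbel-Softmax where the edges are grouped by destination node (illustration only; a real kernel would fuse these operations and also need to back-propagate through the top-k selection that sample_neighbors performs afterwards):

import torch

def segmented_gumbel_softmax(scores, dst, num_nodes, tau=0.5):
    # scores: (E,) unnormalized edge scores; dst: (E,) destination node of each edge.
    gumbel = -torch.log(-torch.log(torch.rand_like(scores)))
    logits = (scores + gumbel) / tau
    # Subtract a global constant for numerical stability; it cancels within each segment.
    exp = torch.exp(logits - logits.max())
    # Per-destination-node normalization via index_add_.
    denom = torch.zeros(num_nodes, device=scores.device).index_add_(0, dst, exp)
    return exp / denom[dst]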