Hello there!
I’m trying to implement the LADIES sampler.
Essentially, the method samples the nodes for each message-passing layer iteratively, with a probability based on each node’s connectivity to the nodes selected in the previous layer. I’m having trouble doing this with the current DGL library (I have v0.8.2).

Following the GitHub implementation of the paper’s original authors, the process for selecting the probabilities for one layer should go as follows:

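```
import numpy as np

U = lap_matrix[seed_nodes, :]
# Squared column norms; flatten the (1, N) result to 1-D, as np.random.choice requires
pi = np.array(np.sum(U.multiply(U), axis=0)).ravel()
p = pi / np.sum(pi)
# Without replacement, we cannot draw more nodes than have non-zero probability
s_num = np.min([np.sum(p > 0), layer_samples])
after_nodes = np.random.choice(g.num_nodes(), s_num, p=p, replace=False)
```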
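
As a sanity check, here is the same computation on a toy graph. This is only a sketch under my own assumptions: a 4-node path graph with self-loops, a row-normalized adjacency as a dense NumPy array standing in for the sparse `lap_matrix`, and a single seed node.

```python
import numpy as np

# Toy path graph 0-1-2-3 with self-loops (assumed for illustration only)
adj = np.array([[1, 1, 0, 0],
                [1, 1, 1, 0],
                [0, 1, 1, 1],
                [0, 0, 1, 1]], dtype=float)
# Row-normalized adjacency, a dense stand-in for lap_matrix
lap_matrix = adj / adj.sum(axis=1, keepdims=True)

seed_nodes = np.array([0])
layer_samples = 3

U = lap_matrix[seed_nodes, :]      # [[0.5, 0.5, 0.0, 0.0]]
pi = np.sum(U * U, axis=0)         # [0.25, 0.25, 0.0, 0.0]
p = pi / pi.sum()                  # [0.5, 0.5, 0.0, 0.0]

# Only two nodes have non-zero probability, so at most two can be drawn
s_num = min(int(np.sum(p > 0)), layer_samples)
rng = np.random.default_rng(0)
after_nodes = rng.choice(len(p), size=s_num, replace=False, p=p)
print(p, sorted(after_nodes.tolist()))
```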

This gives a vector `p`, where `p[i]` is the probability of node `i` being selected in this layer. We can then use these probabilities to select nodes for the following layer. To build a frontier for a single layer, I used the following procedure:

```
nodes = torch.unique(torch.concat((torch.tensor(after_nodes), seed_nodes)))
sg = dgl.node_subgraph(g, nodes)
```

However, doing this manually doesn’t seem efficient, and it doesn’t work as I expect because the result is not properly formulated as a block (it throws errors when I try to use `dgl.to_block(sg, dst_nodes=seed_nodes, src_nodes=after_nodes)`).

Intuitively, I would like to use `dgl.sampling.sample_neighbors()`, but this presents its own challenges:

1. We need to associate probabilities with edges, but since I make the graph undirected via `dgl.to_bidirected()`, every edge is doubled, and associating a probability with two different edge IDs seems cumbersome and inefficient.
2. The point of this probability is that it is adaptive: with each new layer, the probabilities are recomputed from the dst nodes of the previous layer. Continuously updating all the edge data in the graph seems inefficient as well.
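
To spell out the adaptive part in point 2, this is roughly the loop I am trying to reproduce in DGL, written as a plain NumPy sketch. Assumptions of mine: `ladies_layers` and `samples_per_layer` are made-up names, `lap_matrix` is a dense array here (the real one is sparse), and each layer is unioned with the previous layer’s nodes, mirroring the `torch.unique(torch.concat(...))` step above.

```python
import numpy as np

def ladies_layers(lap_matrix, batch_nodes, samples_per_layer, rng):
    """Hypothetical helper: return the node set selected for each layer, in sampling order."""
    previous_nodes = np.asarray(batch_nodes)
    layers = []
    for layer_samples in samples_per_layer:
        # Probabilities depend only on the nodes chosen in the previous layer
        U = lap_matrix[previous_nodes, :]
        pi = np.sum(U * U, axis=0)
        p = pi / pi.sum()
        # Cannot draw more nodes than have non-zero probability
        s_num = min(int(np.count_nonzero(p)), layer_samples)
        after_nodes = rng.choice(len(p), size=s_num, replace=False, p=p)
        # Keep the previous layer's nodes so they stay available as destinations
        after_nodes = np.union1d(after_nodes, previous_nodes)
        layers.append(after_nodes)
        previous_nodes = after_nodes
    return layers

# Toy usage: 5-node path graph with self-loops, row-normalized
adj = np.eye(5) + np.diag(np.ones(4), 1) + np.diag(np.ones(4), -1)
lap = adj / adj.sum(axis=1, keepdims=True)
layers = ladies_layers(lap, np.array([0]), [2, 2], np.random.default_rng(0))
print([layer.tolist() for layer in layers])
```

The probabilities here live on nodes and are recomputed from `previous_nodes` at every layer, which is what makes mapping them onto static edge data feel awkward.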