NegativeSampler in GraphSage's example unsupervised training

In the NegativeSampler class,

class NegativeSampler(object):
    def __init__(self, g, k, neg_share=False):
        self.weights = g.in_degrees().float() ** 0.75
        self.k = k
        self.neg_share = neg_share

    def __call__(self, g, eids):
        src, _ = g.find_edges(eids)
        n = len(src)
        if self.neg_share and n % self.k == 0:
            dst = self.weights.multinomial(n, replacement=True)
            dst = dst.view(-1, 1, self.k).expand(-1, self.k, -1).flatten()
        else:
            dst = self.weights.multinomial(n*self.k, replacement=True)
        src = src.repeat_interleave(self.k)
        return src, dst

Can someone help me understand what the __call__ method is doing? If my understanding is correct, self.weights.multinomial() only samples dst nodes from a multinomial distribution. How does it get negative dst nodes here?

Thank you!


self.weights here is a power of the in-degrees of all nodes in the graph. With self.weights.multinomial(n, replacement=True), it constructs a multinomial distribution over all nodes in the graph, with its probability mass function obtained by normalizing self.weights, i.e. a node with a larger in-degree has a higher chance to be sampled. With ...(n, replacement=True), it samples n nodes with replacement. The sampled nodes serve as “fake” dst for the src, hence forming negative edges.
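To make this concrete, here is a small self-contained sketch (with made-up in-degrees for a 5-node toy graph) showing how torch.Tensor.multinomial draws node IDs in proportion to the 0.75 power of the in-degrees:

```python
import torch

# Hypothetical in-degrees for a 5-node toy graph.
in_degrees = torch.tensor([1.0, 2.0, 4.0, 8.0, 0.0])
weights = in_degrees ** 0.75

# Draw 6 negative dst node IDs with replacement; a node with a larger
# in-degree has a proportionally higher chance of being sampled.
dst = weights.multinomial(6, replacement=True)

print(dst.shape)  # torch.Size([6])
# Node 4 has weight 0, so it can never appear among the samples.
print(bool((dst == 4).any()))  # False
```

Note that multinomial here is only used as a weighted sampler over node IDs; nothing about it knows which edges actually exist, which is exactly what the follow-up question below is about.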


How does it make sure that it is a “fake” dst for src, since it only samples nodes from a multinomial distribution over in-degrees? There is a chance that a sampled dst is the actual dst for src, right?

That’s true, but mathematically including or excluding real edges among the negative samples is equivalent. In practice it also makes little difference in, e.g., word embedding learning. Moreover, if the graph is large, the probability of drawing a “real” dst as a negative example is quite low, so it does not matter much either.

Hmm, that makes sense. However, my graphs have only 265 nodes but at least 10k edges, so it is “large” only in the sense of being dense. Is there a way to exclude the “real” dst nodes? If my memory serves me right, the now-deprecated EdgeSampler was able to exclude those dst nodes.

If your graph is small, an alternative is to apply a mask to the neg_graph generated by the EdgeDataLoader in question. Once you have the neg_graph, you can compute a boolean mask indicating whether each edge in the neg_graph exists in the original graph. This is actually quite easy:

neg_src, neg_dst = neg_graph.edges()
exists = g.has_edges_between(neg_src, neg_dst)

Then in the loss computation you can simply drop the loss terms where exists is True. For instance, before computing the loss on the negative scores:

neg_score = neg_score[~exists]
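Here is a torch-only sketch of that masking step, with hypothetical scores and a hypothetical exists mask standing in for the output of g.has_edges_between:

```python
import torch

# Hypothetical scores for 4 sampled negative pairs, and a mask that marks
# which of those pairs turned out to be real edges in the original graph.
neg_score = torch.tensor([0.9, 0.1, 0.7, 0.3])
exists = torch.tensor([True, False, False, True])

# Keep only the scores of pairs that are NOT real edges.
neg_score = neg_score[~exists]

print(neg_score)  # tensor([0.1000, 0.7000])
```

After this filtering, neg_score feeds into the loss exactly as before, just with the accidental “real” edges removed from the negative set.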