Hi,
I am building a recommender system using GNNs and DGL. My graph is heterogeneous: I have two types of nodes ('user', 'item') and two types of relations (user - buys - item, item - bought-by - user), but I might add other node and relation types later on.
I am trying to replicate the GCMC code to build the recommender system. However, my data is implicit, i.e. I only know which items a user bought, and I have no ratings and no information about why a user did not buy a given item.
As I understand it, GCMC trains the model with a negative log-likelihood loss. For every item on which a user has an observed rating, the loss compares the predicted probability distribution (a softmax over the possible rating levels) with the ground-truth distribution.
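This is my reading of that loss as a plain numpy sketch (the function and argument names are my own, not from the GCMC code):

```python
import numpy as np

def gcmc_nll(logits, ratings):
    """Negative log-likelihood over rating classes.

    logits:  (n_pairs, n_rating_levels) raw scores per observed (user, item) pair
    ratings: (n_pairs,) index of the observed rating level for each pair
    """
    # Softmax over the rating levels, computed in log space for stability.
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    # Mean negative log-probability assigned to the observed rating.
    return -log_probs[np.arange(len(ratings)), ratings].mean()
```

With implicit data every observed pair falls into the same rating class, which is exactly the degeneracy described below.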
However, if I apply this to implicit data, the loss would still only consider observed interactions, and in my case every observed interaction has the same rating of 1.
Has anyone encountered a similar issue?
I tried building a max-margin loss with negative sampling. However, since my number of items is relatively small (<10,000), I would prefer to avoid negative sampling.
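What I tried looks roughly like this (a numpy sketch with dot-product scores and one uniformly sampled negative per user; the embedding names are placeholders, not my actual model):

```python
import numpy as np

rng = np.random.default_rng(0)

def max_margin_loss(user_emb, item_emb, pos_items, margin=1.0):
    """Hinge loss with one uniformly sampled negative item per user.

    user_emb:  (n_users, d) user embeddings
    item_emb:  (n_items, d) item embeddings
    pos_items: (n_users,) index of a bought item for each user
    """
    n_items = item_emb.shape[0]
    # Uniform negative sampling over all items.
    neg_items = rng.integers(0, n_items, size=len(pos_items))
    pos_scores = (user_emb * item_emb[pos_items]).sum(axis=1)
    neg_scores = (user_emb * item_emb[neg_items]).sum(axis=1)
    # Push each bought item's score above the sampled negative's by `margin`.
    return np.maximum(0.0, margin - pos_scores + neg_scores).mean()
```

Since the item set is small, my question is whether there is a standard way to score all items at once instead of sampling negatives.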
Thanks!