Loss function for GCMC-like recommender system with implicit data

Hi,
I am building a recommender system using GNNs and DGL. My graph is heterogeneous: I have 2 node types (‘user’, ‘item’) and 2 relation types (user - buys - item, item - boughtby - user), but I might add other node and relation types later on.

I am trying to replicate the GCMC code to build the recommender system. However, my data is implicit, i.e. I only have information about which items a user bought, and no ratings or any information about why a user did not buy a given item.

As I understand it, GCMC trains the model with a negative log-likelihood loss. For every (‘user’, ‘item’) pair with an observed rating, the loss compares the predicted probability distribution (a softmax over the different rating levels) with the ground-truth distribution.
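For reference, a minimal sketch of that objective (shapes and names here are hypothetical; the logits would come from the bilinear decoder):

```python
import torch
import torch.nn.functional as F

# Hypothetical shapes: one row per observed (user, item) edge,
# one column per rating class (e.g. ratings 1..5 -> classes 0..4).
logits = torch.randn(1024, 5)              # decoder scores per edge
ratings = torch.randint(0, 5, (1024,))     # observed rating class per edge

# GCMC-style objective: softmax over rating classes, then the
# negative log likelihood of the observed rating (cross-entropy).
loss = F.cross_entropy(logits, ratings)
```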

However, if I apply this to implicit data, the loss function would only consider observed interactions, and in this case all those observed interactions have a rating of 1.

Has anyone encountered a similar issue?

I tried building a max-margin loss with negative sampling. However, since my number of items is relatively low (<10,000), I would prefer not to use negative sampling.
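For reference, here is roughly the shape of what I tried (a sketch; the tensor names and shapes are illustrative, and the scores come from my decoder):

```python
import torch

def max_margin_loss(pos_score, neg_score, margin=1.0):
    # pos_score: (B,) decoder scores for observed (user, item) pairs.
    # neg_score: (B, k) scores for k sampled negative items per user.
    # Hinge loss: push each positive above its negatives by `margin`.
    return torch.clamp(margin - pos_score.unsqueeze(1) + neg_score, min=0).mean()
```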

Thanks!

In this case, how about formulating it as a multi-label classification problem? Essentially, for each user you predict a 10,000-dimensional vector indicating the probability of each item being bought.
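A minimal sketch of that objective, assuming the GNN has already produced the embeddings (all sizes here are placeholders), would be a binary cross-entropy over all items:

```python
import torch
import torch.nn.functional as F

num_items, emb_dim, batch_size = 10000, 128, 64   # placeholder sizes
user_emb = torch.randn(batch_size, emb_dim)       # batched user embeddings
item_emb = torch.randn(num_items, emb_dim)        # all item embeddings

# One logit per (user, item) pair; each item is an independent
# binary label: 1 if the user bought it, 0 otherwise.
logits = user_emb @ item_emb.t()                  # (batch_size, num_items)
labels = torch.zeros(batch_size, num_items)       # fill from purchase data

loss = F.binary_cross_entropy_with_logits(logits, labels)
```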

Thanks for answering. Yes, I had considered this classification formulation as an alternative.

As a follow-up question, would you have any recommendation on how to batch-train it?

I need to predict a 10,000-dimensional vector for every user. My prediction function (or bilinear decoder) is a cosine similarity between the user and item vectors. Thus, I would need to have all my item nodes in a single batch when using the NodeDataLoader.
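Concretely, the per-batch scoring I have in mind looks like this (a sketch; I assume the GNN has already produced the embeddings for a batch of users and for all items):

```python
import torch
import torch.nn.functional as F

def score_batch(user_emb_batch, item_emb_all):
    # Cosine similarity between each batched user and every item:
    # L2-normalize both sides, then a single matrix multiply.
    u = F.normalize(user_emb_batch, dim=1)    # (B, d)
    i = F.normalize(item_emb_all, dim=1)      # (num_items, d)
    return u @ i.t()                          # (B, num_items)
```

Since the item set is small, keeping the full item embedding table in every batch seems feasible; my question is how to set this up with the NodeDataLoader.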