I have data on users and the bands they listened to. What is the best way to create embeddings of users and artists?

I am using a Last.fm dataset with U users and M artists. It contains two kinds of edges:

  1. (user, user) edges, where an edge exists if two users follow each other.
  2. (user, artist) edges where an edge exists if a user listens to an artist.

Goal: I want to represent both users and artists in the same latent space to be able to say how similar an artist is to a user’s tastes.

  1. One approach is to create a utility matrix of dimension U x M where entry (u, m) = 1 if user u listened to artist m and 0 otherwise. Then I can factorize this matrix with something like Alternating Least Squares.

  2. Another approach is to use a model for heterogeneous graphs such as metapath2vec, which learns node embeddings from random walks on the graph.
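
A minimal sketch of what I mean by the first approach, assuming plain regularized ALS on a dense binary matrix (not the confidence-weighted implicit-feedback variant). The toy matrix, the function name `als`, and the values of `k` and `reg` are made up for illustration:

```python
import numpy as np

def als(R, k=2, reg=0.1, n_iters=20, seed=0):
    """Factorize R (users x artists) as P @ Q.T with alternating ridge solves."""
    rng = np.random.default_rng(seed)
    n_users, n_artists = R.shape
    P = rng.normal(scale=0.1, size=(n_users, k))    # user embeddings
    Q = rng.normal(scale=0.1, size=(n_artists, k))  # artist embeddings
    reg_eye = reg * np.eye(k)
    for _ in range(n_iters):
        # Fix Q and solve the least-squares problem for all users at once.
        P = np.linalg.solve(Q.T @ Q + reg_eye, Q.T @ R.T).T
        # Fix P and solve for all artists at once.
        Q = np.linalg.solve(P.T @ P + reg_eye, P.T @ R).T
    return P, Q

# Toy data (hypothetical): users 0-1 listen to artists 0-1, users 2-3 to artists 2-3.
R = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
], dtype=float)

P, Q = als(R)
scores = P @ Q.T  # scores[u, m] estimates how well artist m fits user u's tastes
```

Here users and artists end up in the same k-dimensional space, and the dot product of a user vector with an artist vector is the similarity score I am after. The (user, user) edges are not used at all.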


The advantage of the first option is that it’s simpler, but it ignores the (user, user) edges. I believe there should be some extension of collaborative filtering that can also incorporate them.

The advantage of the second option is that it can utilize (user, user) edges.
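
To make the second option concrete, here is a rough sketch of how walks over both edge types could be generated. It is a simplified version of meta-path-guided walks, not the metapath2vec reference implementation. The toy edges, the helper names, and the user-user-artist pattern are my own assumptions:

```python
import random

# Toy edges (hypothetical), one list per edge type.
user_user = [("u1", "u2"), ("u2", "u3")]
user_artist = [("u1", "a1"), ("u2", "a1"), ("u2", "a2"), ("u3", "a2")]

def build_neighbors(user_user, user_artist):
    """Map each node to its neighbors, grouped by neighbor type."""
    nbrs = {}

    def add(src, dst_type, dst):
        nbrs.setdefault(src, {"user": [], "artist": []})[dst_type].append(dst)

    for u, v in user_user:       # mutual follows are undirected
        add(u, "user", v)
        add(v, "user", u)
    for u, a in user_artist:     # listening edges, walkable in both directions
        add(u, "artist", a)
        add(a, "user", u)
    return nbrs

def metapath_walk(nbrs, start, metapath, walk_length, rng):
    """Random walk whose i-th node has type metapath[i % len(metapath)].

    `start` must be a node of type metapath[0].
    """
    walk = [start]
    for step in range(1, walk_length):
        next_type = metapath[step % len(metapath)]
        candidates = nbrs[walk[-1]][next_type]
        if not candidates:       # dead end: no neighbor of the required type
            break
        walk.append(rng.choice(candidates))
    return walk

nbrs = build_neighbors(user_user, user_artist)
rng = random.Random(0)
# Pattern user -> user -> artist -> user -> ... uses both edge types.
walks = [
    metapath_walk(nbrs, user, ["user", "user", "artist"], 7, rng)
    for user in ["u1", "u2", "u3"]
    for _ in range(5)
]
```

The resulting walks would then be treated as sentences and passed to a skip-gram model (for example gensim's `Word2Vec`), so that users and artists that co-occur in walks get nearby vectors in one shared space.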

