I am using a Lastfm dataset with U users and M artists. The graph has two kinds of edges:
- (user, user) edges, where an edge exists if two users follow each other.
- (user, artist) edges, where an edge exists if a user listens to an artist.
Goal: I want to represent both users and artists in the same latent space to be able to say how similar an artist is to a user’s tastes.
One approach is to create a utility matrix of dimension U x M, where entry (u, m) = 1 if user u listened to artist m and 0 otherwise. I can then factorize this matrix into user and artist factors with something like Alternating Least Squares.
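For concreteness, here is a minimal sketch of that factorization in NumPy. It is only an illustration under some assumptions: the 4 x 4 matrix `R` is a made-up toy example rather than the Lastfm data, the names `als`, `k`, `reg`, and `iters` are hypothetical, and zeros are treated as observed values (plain regularized least squares), which is simpler than the confidence-weighted variants usually used for implicit feedback.

```python
import numpy as np

def als(R, k=2, reg=0.1, iters=20, seed=0):
    """Approximate R (U x M) as X @ Y.T using regularized alternating least squares."""
    rng = np.random.default_rng(seed)
    n_users, n_artists = R.shape
    X = rng.normal(scale=0.1, size=(n_users, k))    # user factors
    Y = rng.normal(scale=0.1, size=(n_artists, k))  # artist factors
    ridge = reg * np.eye(k)
    for _ in range(iters):
        # Fix Y and solve the ridge regression for X: X = R Y (Y^T Y + reg I)^-1
        X = np.linalg.solve(Y.T @ Y + ridge, Y.T @ R.T).T
        # Fix X and solve the ridge regression for Y: Y = R^T X (X^T X + reg I)^-1
        Y = np.linalg.solve(X.T @ X + ridge, X.T @ R).T
    return X, Y

# Toy utility matrix (assumed data): users 0-1 listen to artists 0-1,
# users 2-3 listen to artists 2-3.
R = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
], dtype=float)

X, Y = als(R)

# Users and artists now live in the same k-dimensional space;
# the dot product X[u] @ Y[m] scores how well artist m matches user u.
scores = X @ Y.T
```

With this setup, `scores[u, m]` can be read as the affinity between user u and artist m, so ranking a row of `scores` gives the artists closest to that user's tastes. Note that this approach ignores the (user, user) edges entirely.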
Another approach is to use a model for heterogeneous graphs such as metapath2vec, which learns node embeddings from random walks that follow a given metapath (a fixed sequence of node types).
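To make the random-walk step concrete, here is a small sketch of a metapath-guided walk in plain Python. Everything in it is an assumption for illustration: the toy graph, the node names, the helper `metapath_walk`, and the choice of the user-artist-user metapath are not taken from the dataset or from any library.

```python
import random

def metapath_walk(adj, node_type, start, metapath, walk_length, rng):
    """Random walk from `start` that only moves to neighbors whose type
    matches the next entry of a cyclic metapath such as user-artist-user."""
    if metapath[0] != metapath[-1]:
        raise ValueError("metapath must start and end with the same node type")
    if node_type[start] != metapath[0]:
        raise ValueError("start node type must match the first metapath entry")
    pattern = metapath[:-1]  # drop the repeated last type so the pattern can cycle
    walk = [start]
    while len(walk) < walk_length:
        wanted = pattern[len(walk) % len(pattern)]
        candidates = [n for n in adj[walk[-1]] if node_type[n] == wanted]
        if not candidates:  # dead end: no neighbor of the required type
            break
        walk.append(rng.choice(candidates))
    return walk

# Toy heterogeneous graph (assumed data), stored as undirected adjacency lists.
node_type = {"u1": "user", "u2": "user", "u3": "user",
             "a1": "artist", "a2": "artist"}
adj = {
    "u1": ["u2", "a1"],        # u1 and u2 follow each other; u1 listens to a1
    "u2": ["u1", "a1", "a2"],
    "u3": ["a2"],
    "a1": ["u1", "u2"],
    "a2": ["u2", "u3"],
}

rng = random.Random(0)
metapath = ["user", "artist", "user"]
walks = [
    metapath_walk(adj, node_type, start, metapath, walk_length=6, rng=rng)
    for start in ("u1", "u2", "u3")
]
```

The resulting `walks` are sequences of node ids that can be treated like sentences and passed to a skip-gram model to obtain one embedding per node, users and artists alike. A metapath such as user-user-artist-user would also bring the (user, user) edges into the walks, which the utility-matrix approach cannot do.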