In the explicit case it is a choice. You could fill in the missing entries with all zeros (but this will heavily bias your training data). Alternatively you can compute your updates only where R_{ij} is known (but this is more complex to code and slower to run).
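To make the second option concrete, here is a minimal NumPy sketch of a squared loss computed only over the observed ratings; the matrix values, shapes, and rank are illustrative, not from the post:

```python
import numpy as np

# Toy explicit rating matrix; 0.0 marks a missing rating (illustrative values)
R = np.array([[5.0, 3.0, 0.0],
              [4.0, 0.0, 1.0]])
mask = R > 0  # True only where a rating was actually observed

# Random latent factors for 2 users and 3 items, rank 2 (hypothetical init)
rng = np.random.default_rng(0)
X = rng.standard_normal((2, 2))  # user factors
Y = rng.standard_normal((3, 2))  # item factors

# Squared loss restricted to observed entries: missing R_{ij} contribute nothing
errors = (R - X @ Y.T)[mask]
loss = np.sum(errors ** 2)
```

The boolean mask is what keeps the missing entries out of both the loss and any gradient you derive from it.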

In the implicit case you would fill with zeros (and use your confidence factor to control for the bias).
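A sketch of that weighting, in the style of the standard implicit-feedback ALS loss: every cell enters the sum, but the confidence term keeps the zero-filled cells from dominating. The counts and the alpha value below are illustrative assumptions:

```python
import numpy as np

# Toy implicit-feedback counts (e.g. play counts); illustrative values
R = np.array([[3.0, 0.0, 1.0],
              [0.0, 5.0, 0.0]])
P = (R > 0).astype(float)  # binary preference: missing entries filled with zeros
alpha = 40.0               # confidence scaling, a tunable hyperparameter
C = 1.0 + alpha * R        # confidence: low (1.0) for unobserved cells

# Random latent factors, rank 2 (hypothetical init)
rng = np.random.default_rng(1)
X = rng.standard_normal((2, 2))  # user factors
Y = rng.standard_normal((3, 2))  # item factors

# Every cell contributes, weighted by how confident we are in its preference
loss = np.sum(C * (P - X @ Y.T) ** 2)
```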

For actual practical use cases I would not recommend implementing this from scratch yourself. Use a library like https://github.com/benfred/implicit or Spark’s ALS implementation.

I’ve one question: to find the latent vectors we frame it as an optimisation problem using a standard squared loss. To minimize the squared loss we need R_{ij} – X_i * Y_j, where R_{ij} represents the raw rating in the user-item rating matrix. But as we all know, not every entry in the user-item rating matrix is available; there are always many missing ratings. For example, if user i does not rate item j, then R_{ij} is missing. How can we deal with a missing R_{ij}? Fill it with 0 or with some random number? Or just compute R_{ij} – X_i * Y_j for the available R_{ij} and skip any calculation for the missing ones?

Thanks for a great post. Can you cite a source on how to embed User and Item master data (attributes of Users and Items) as additional “latent factors” to make the recommendation somehow better?

Thanks.

https://raw.githubusercontent.com/jtopor/CUNY-MSDA-643/master/FP/artists.dat
