Recommending Movies to Myself
Feb 5, 2017
4 minutes read

Winter break is the perfect time for some binge-watching or a movie marathon before the new semester starts.

But the problem is, most of the time I can’t decide what to watch. I subscribe to Netflix and Amazon Prime, but I feel their recommendations don’t fit me well. I often spend 10 minutes just scrolling through their recommendations and end up watching YouTube.

Because of a course I took last semester, I had to spend weeks reading papers about recommender systems. Then I thought, what if I used Koren’s SVD++ to recommend movies to myself? Five minutes later, I was skimming through Koren’s paper again, even though last semester I promised myself I would never open it again (jk).

The SVD++

[Figure: the SVD++ model equation]

The main purpose of SVD is to transform item and user features into the same latent factor space so that they’re directly comparable. In SVD++, Koren added implicit feedback and neighbourhood terms to the model.

What the model does is predict the rating r that user u gives to item i. The first part captures general properties of users and items: the global average rating and the biases of both users and items. The second part describes the user profile and the item profile and how they interact. It also folds in the implicit feedback present in the set of items each user has rated.
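For reference, the prediction rule described above can be written as follows. This is the form I remember from Koren’s paper with the neighbourhood terms left out, and the symbols follow the paper’s notation rather than anything defined in this post: \mu is the global average rating, b_u and b_i are the user and item biases, p_u and q_i are the user and item factor vectors, N(u) is the set of items user u has rated (the implicit feedback), and y_j is the implicit-feedback factor vector of item j.

```latex
\hat{r}_{ui} = \underbrace{\mu + b_u + b_i}_{\text{first part}}
  + \underbrace{q_i^{\top}\Bigl(p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j\Bigr)}_{\text{second part}}
```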

The objective of the model is to minimise the error between predicted ratings and the actual ratings. It is done using gradient descent.
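To make the training step concrete, here is a minimal sketch of stochastic gradient descent for this kind of model, without the neighbourhood terms. It is not the exact code I ran: the function names, the tiny made-up ratings, the learning rate `lr`, and the regularisation weight `reg` are all assumptions for illustration.

```python
import numpy as np

def predict(u, i, mu, bu, bi, P, Q, Y, rated):
    """Predicted rating of user u for item i (no neighbourhood terms)."""
    items = rated[u]  # items user u has rated, used as implicit feedback
    implicit = Y[items].sum(axis=0) / np.sqrt(len(items))
    return mu + bu[u] + bi[i] + Q[i] @ (P[u] + implicit)

def sgd_epoch(ratings, mu, bu, bi, P, Q, Y, rated, lr=0.01, reg=0.05):
    """One pass of stochastic gradient descent over (user, item, rating) triples."""
    for u, i, r in ratings:
        items = rated[u]
        norm = 1.0 / np.sqrt(len(items))
        implicit = Y[items].sum(axis=0) * norm
        err = r - predict(u, i, mu, bu, bi, P, Q, Y, rated)
        q_old = Q[i].copy()  # keep the pre-update item factors for the other updates
        bu[u] += lr * (err - reg * bu[u])
        bi[i] += lr * (err - reg * bi[i])
        Q[i] += lr * (err * (P[u] + implicit) - reg * Q[i])
        P[u] += lr * (err * q_old - reg * P[u])
        Y[items] += lr * (err * norm * q_old - reg * Y[items])

def rmse(ratings, *params):
    """Root mean squared error of the predictions on the given ratings."""
    errs = [r - predict(u, i, *params) for u, i, r in ratings]
    return float(np.sqrt(np.mean(np.square(errs))))

# Made-up toy data: 2 users, 3 items, 4 latent factors.
rng = np.random.default_rng(0)
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0)]
rated = {0: [0, 1], 1: [0, 2]}
n_users, n_items, k = 2, 3, 4

mu = float(np.mean([r for _, _, r in ratings]))  # global average rating
params = (
    mu,
    np.zeros(n_users),                 # user biases
    np.zeros(n_items),                 # item biases
    rng.normal(0, 0.1, (n_users, k)),  # user factors
    rng.normal(0, 0.1, (n_items, k)),  # item factors
    np.zeros((n_items, k)),            # implicit-feedback item factors
    rated,
)

before = rmse(ratings, *params)
for _ in range(200):
    sgd_epoch(ratings, *params)
after = rmse(ratings, *params)
```

Comparing `before` and `after` shows whether the training error went down. On real data you would check the error on held-out ratings instead, since the training error alone says nothing about overfitting.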

There is another thing that Koren added to the model: the neighbourhood terms. For some reason, I didn’t include them. But if you want to dive deeper into the model, you can read his paper here.

The data

First things first, I need a recent movie-rating dataset so I can get good recommendations. I decided to use MovieLens 20M because it was updated in October last year (so it’s not that old).

Then, I had to make a list of the movies I have watched. Yes, it’s the boring part unless you’re into that sort of thing. I didn’t actually know where to start, but thank goodness, the IMDb Top 250 gave me a great must-watch list. It turns out I have watched 91 of the 250 top-rated movies on IMDb (not bad, I guess).

What the model recommends

As I mentioned earlier, I used the IMDb Top 250 to help me make a list of the movies I’ve watched. I then made a pandas DataFrame with movieId in one column and rating in another. I also added some superhero movies like Spider-Man and The Avengers. Then, I fed the list to the model so it could learn my personal taste.
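Here is a rough sketch of that step. The small `ratings` frame stands in for the MovieLens ratings table (I’m assuming the `userId`, `movieId`, `rating` column names from its ratings file), and every id and score below is made up for illustration.

```python
import pandas as pd

# Tiny stand-in for the MovieLens ratings table.
ratings = pd.DataFrame({
    "userId": [1, 1, 2],
    "movieId": [10, 20, 10],
    "rating": [4.0, 3.5, 5.0],
})

# My own list: movieId in one column, rating in the other (hypothetical values).
my_ratings = pd.DataFrame({
    "movieId": [10, 30],
    "rating": [5.0, 4.5],
})

# Give myself a user id that no existing user has.
my_user_id = ratings["userId"].max() + 1
my_ratings.insert(0, "userId", my_user_id)

# Append my rows so the model trains on my ratings along with everyone else's.
combined = pd.concat([ratings, my_ratings], ignore_index=True)
```

Adding myself as one more user means the model learns my bias and factors the same way it does for every other user, so no special handling is needed at prediction time.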

Of course, it’s impossible to list every movie I have watched in my entire life. Given the list I created, the model might recommend some movies that I have watched before. But that’s also a good thing, because then I can judge the quality of the recommendations I get. Moreover, I personally wrote a list full of movies and ratings, and I believe it’s much better than what Netflix and Prime have from me (especially because I had never rated movies before lol).

Here’s how the model predicts my taste:

[Figure: the model’s predicted ratings for me]

The result is surprisingly good.

The model predicts that I would give 5 stars to The Dark Knight Rises, The Dark Knight, and The Pursuit of Happyness, which in other words means it recommends that I watch them. Actually, I’ve watched all three, and I love The Pursuit of Happyness a lot. It’s one of my favourite movies (but it’s not in the IMDb Top 250, so I didn’t put it on my list). Both Dark Knight movies are also really good.

I don’t know why the model recommends Elite Squad to me, but I’ll definitely watch it. Besides, it’s rated highly on IMDb and Rotten Tomatoes.

I’ve also already watched Captain Phillips, Inside Job, Limitless, and Seven Pounds, and I love them. The list goes on and on. I found Ip Man, Kingsman, and The Theory of Everything in the next 10 recommendations. Creepy and jaw-dropping at the same time, because I love those movies so much.

Disaster Movie, SuperBabies: Baby Geniuses 2, and Son of the Mask are among the movies the model predicts I’ll dislike the most. To satisfy my curiosity, I looked at their reviews on IMDb and Rotten Tomatoes. The model is right.

I think I’ll spend next summer doing a movie marathon in my room.

