How do you know what you like? Netflix prize edition

The Netflix prize has provoked a blizzard of data-mining responses to this question: how, given knowledge about how you have rated movies in the past, will you rate particular movies going forward? The main goal of the prize, which is ongoing and worth a million bucks if you can do it, is to ‘improve’ the Cinematch algorithm they currently use to make guesses about what you like. As this fascinating Wired piece notes, it’s the difference between saying, You liked The Squid and the Whale? Here’s Margot at the Wedding – as opposed to a fishing documentary on Jacques Cousteau or Moby Dick. A 10% reduction in prediction error (measured as root mean squared error, RMSE) will net you the $1M.

To that end, Netflix has put up a ginormous data set for your mining pleasure, and your aim is to ‘mine’ that data and then produce an algorithm that will ‘predict’ how people rate the movies in their held-out test set. So they’re basically giving you 9 years of a 10-year data set, and asking you to predict how people rated movies in year 10.
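To make the scoring concrete: entries are judged by RMSE between predicted and actual ratings on that held-out data, and the prize goes to whoever beats Cinematch’s score by 10%. Here is a minimal sketch of the metric, with made-up numbers standing in for real predictions:

```python
from math import sqrt

def rmse(predicted, actual):
    """Root mean squared error between predicted and true star ratings."""
    assert len(predicted) == len(actual)
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

# Hypothetical model output vs. the ratings users actually gave.
predictions = [3.8, 2.1, 4.5, 3.0]
truth       = [4.0, 2.0, 5.0, 2.0]
print(round(rmse(predictions, truth), 3))

# Cinematch sits at roughly 0.95 RMSE on the contest's quiz data;
# the million dollars requires getting that down to about 0.86.
```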

So, data mining heaven. As the Wired article points out:

Many of the contestants begin, like Cinematch does, with something called the k-nearest-neighbor algorithm — or, as the pros call it, kNN. This is what Amazon.com uses to tell you that “customers who purchased Y also purchased Z.” Suppose Netflix wants to know what you’ll think of Not Another Teen Movie. It compiles a list of movies that are “neighbors” — films that received a high score from users who also liked Not Another Teen Movie and films that received a low score from people who didn’t care for that Jaime Pressly yuk-fest. It then predicts your rating based on how you’ve rated those neighbors. The approach has the advantage of being quite intuitive: If you gave Scream five stars, you’ll probably enjoy Not Another Teen Movie.

BellKor uses kNN, but it also employs more abstruse algorithms that identify dimensions along which movies, and movie watchers, vary. One such scale would be “highbrow” to “lowbrow”; you can rank movies this way, and users too, distinguishing between those who reach for Children of Men and those who prefer Children of the Corn.

Of course, this system breaks down when applied to people who like both of those movies. You can address this problem by adding more dimensions — rating movies on a “chick flick” to “jock movie” scale or a “horror” to “romantic comedy” scale. You might imagine that if you kept track of enough of these coordinates, you could use them to profile users’ likes and dislikes pretty well. The problem is, how do you know the attributes you’ve selected are the right ones? Maybe you’re analyzing a lot of data that’s not really helping you make good predictions, and maybe there are variables that do drive people’s ratings that you’ve completely missed.

BellKor (along with lots of other teams) deals with this problem by means of a tool called singular value decomposition, or SVD, that determines the best dimensions along which to rate movies. These dimensions aren’t human-generated scales like “highbrow” versus “lowbrow”; typically they’re baroque mathematical combinations of many ratings that can’t be described in words, only in pages-long lists of numbers. At the end, SVD often finds relationships between movies that no film critic could ever have thought of but that do help predict future ratings.
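To make the kNN idea from the excerpt concrete, here is a minimal item-based neighbor sketch: predict a user’s rating for a movie as a similarity-weighted average of their ratings for the movie’s nearest ‘neighbors’. The toy ratings table, the cosine similarity, and the fallback value are all my own illustrative choices, not anything Cinematch or BellKor actually does:

```python
from math import sqrt

# ratings[user][movie] = stars; a tiny, made-up example
ratings = {
    "alice": {"Scream": 5, "Not Another Teen Movie": 4, "Children of Men": 2},
    "bob":   {"Scream": 4, "Not Another Teen Movie": 5, "Children of the Corn": 4},
    "carol": {"Scream": 1, "Children of Men": 5, "Children of the Corn": 1},
}

def item_similarity(m1, m2):
    """Cosine similarity between two movies, over users who rated both."""
    common = [u for u in ratings if m1 in ratings[u] and m2 in ratings[u]]
    if not common:
        return 0.0
    dot = sum(ratings[u][m1] * ratings[u][m2] for u in common)
    norm1 = sqrt(sum(ratings[u][m1] ** 2 for u in common))
    norm2 = sqrt(sum(ratings[u][m2] ** 2 for u in common))
    return dot / (norm1 * norm2)

def predict(user, movie, k=2):
    """Similarity-weighted average of the user's ratings for the k nearest neighbor movies."""
    neighbors = sorted(
        ((item_similarity(movie, m), r) for m, r in ratings[user].items() if m != movie),
        reverse=True,
    )[:k]
    total = sum(sim for sim, _ in neighbors)
    if total == 0:
        return 3.0  # no usable neighbors: fall back to a neutral guess
    return sum(sim * r for sim, r in neighbors) / total

# What might carol think of the Jaime Pressly yuk-fest?
print(round(predict("carol", "Not Another Teen Movie"), 2))
```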
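The SVD the article mentions works, in the contest setting, more like this: because most of the user–movie matrix is empty, teams typically learn a small set of latent factors per user and per movie by gradient descent rather than running a textbook decomposition. A hedged sketch of that factor-learning loop, with made-up ratings and hyperparameters chosen purely for illustration:

```python
import random

# (user_id, movie_id, rating) triples -- a made-up stand-in for the real data
triples = [(0, 0, 5.0), (0, 1, 4.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
n_users, n_movies, n_factors = 3, 3, 2

random.seed(0)
user_f = [[random.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
movie_f = [[random.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_movies)]

def predict(u, m):
    """Predicted rating is the dot product of the user's and movie's factor vectors."""
    return sum(uf * mf for uf, mf in zip(user_f[u], movie_f[m]))

# Stochastic gradient descent on squared error, with a little L2 regularization.
lr, reg = 0.05, 0.02
for _ in range(200):
    for u, m, r in triples:
        err = r - predict(u, m)
        for k in range(n_factors):
            uf, mf = user_f[u][k], movie_f[m][k]
            user_f[u][k] += lr * (err * mf - reg * uf)
            movie_f[m][k] += lr * (err * uf - reg * mf)

# The learned factor columns are the "dimensions no critic would name":
# they fall out of the ratings themselves, not out of labels like "highbrow".
print(round(predict(0, 2), 2))  # user 0's guess for a movie they never rated
```

Contest entries do essentially this at vastly larger scale, with dozens of factors and millions of rating triples.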

Fair enough. What’s interesting is that a psychologist working by himself out of his home seems to have pulled from nowhere into something like 5th place. What he’s been doing is building knowledge about bias effects into his model – in other words, he ends up with a better reduction than a purely data-driven SVD. As the article puts it, “…The computer scientists and statisticians at the top of the leaderboard have developed elaborate and carefully tuned algorithms for representing movie watchers by lists of numbers, from which their tastes in movies can be estimated by a formula. Which is fine, in Gavin Potter’s view — except people aren’t lists of numbers and don’t watch movies as if they were.”
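The article doesn’t spell out the model, so the following is only a guess at what ‘building bias effects in’ might look like in code: strip out the systematic offsets first (some users rate everything generously, some movies get high marks from everyone) and let the fancier machinery work on the residuals. All names and numbers here are illustrative assumptions, not Gavin Potter’s actual method:

```python
from collections import defaultdict

# (user, movie, rating) triples -- made-up data
triples = [("u1", "m1", 5), ("u1", "m2", 4), ("u2", "m1", 2), ("u2", "m3", 1), ("u3", "m3", 5)]

global_mean = sum(r for _, _, r in triples) / len(triples)

# Average offset from the global mean, per user and per movie.
user_dev, user_n = defaultdict(float), defaultdict(int)
movie_dev, movie_n = defaultdict(float), defaultdict(int)
for u, m, r in triples:
    user_dev[u] += r - global_mean
    user_n[u] += 1
    movie_dev[m] += r - global_mean
    movie_n[m] += 1

user_bias = {u: user_dev[u] / user_n[u] for u in user_dev}
movie_bias = {m: movie_dev[m] / movie_n[m] for m in movie_dev}

def baseline(user, movie):
    """Bias-only prediction: global mean + how generous the user is + how well-liked the movie is."""
    return global_mean + user_bias.get(user, 0.0) + movie_bias.get(movie, 0.0)

# A kNN or SVD model then only has to explain the residual
# rating - baseline(user, movie), rather than the raw rating.
print(round(baseline("u2", "m2"), 2))
```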

I think what is most responsible for the psychologist’s breakthrough is that he has a theory of taste, not just a data-mined algorithm. The math is the math (he has his 17-year-old daughter running the calculus), but if you matched a better theory of taste with some elbow grease, it might be possible to beat the bigger data-miners. Any takers?
