Wired has an interesting story about the $1 million Netflix Prize contest, "This Psychologist Might Outsmart the Math Brains Competing for the Netflix Prize." The story goes behind the scenes and talks to several of the research teams, including the mysterious "Just a Guy in a Garage."
Secrecy hasn't been a big part of the Netflix competition. The prize hunters, even the leaders, are startlingly open about the methods they're using, acting more like academics huddled over a knotty problem than entrepreneurs jostling for a $1 million payday. In December 2006, a competitor called "simonfunk" posted a complete description of his algorithm — which at the time was tied for third place — giving everyone else the opportunity to piggyback on his progress. "We had no idea the extent to which people would collaborate with each other," says Jim Bennett, vice president for recommendation systems at Netflix. When I ask Yehuda Koren, BellKor's leader, whether the prize money would go to him and his teammates or to AT&T, he pauses. He seems honestly to have never considered the question. "We got a big prize by learning and interacting with other teams," he says. "This is the real prize for us."
"Just a guy in a garage" was the exception to all this openness. He didn't even have a link attached to his screen name, which kept creeping higher and higher on the leaderboard. By mid-January, there were just five teams, out of 25,000 entrants, ahead of him. And still, no one knew who he was or by what statistical magic he kept improving. "He's very mysterious," says Koren with unconcealed interest. "I hope you will at least be able to find out his name."
The rest of the story is available online.
It is great posts like this that I so love about hacking netflix.
Posted by: hueristix | February 27, 2008 at 11:19 PM
Somebody remind me - how much money has been paid out by Netflix on this contest so far to the various contestants?
Posted by: Edward R Murrow | February 27, 2008 at 11:40 PM
Netflix awarded the first $50,000 progress prize last November. They will continue to award that sum yearly until someone breaks the 10% barrier. Read more about it here:
http://en.wikipedia.org/wiki/Netflix_Prize
Posted by: | February 28, 2008 at 01:48 AM
And from the picture, it looks like he does all his work with a kitten in his lap. Cool.
Posted by: tvindy | February 28, 2008 at 12:12 PM
Only $50,000 awarded? Wow, Netflix has gotten a huge amount of press for just $50,000.
Only the Netflix 'sheeple' take this contest seriously. It's pretty obvious to critical thinkers what this 'contest' was all about and Wired was the latest dumb victim.
Posted by: Edward R Murrow | February 28, 2008 at 03:09 PM
yes, it is an incredibly effective promotional campaign. Netflix has a smart PR department.
Posted by: | February 28, 2008 at 04:36 PM
That is one of the best articles I've read in a long time. It was written by a math professor at Univ Wisconsin. As a psychologist, I appreciated his perspective on psychology, but I disagree with one or two of his comments about human behavior. He implies that some variation in human behavior is unexplainable or unpredictable; therefore the 10% improvement in predictive validity may never be achieved. Yet the inherent contradiction in his argument is that all current prediction in the model is based on quantifying prior human behavior. Just because we don't understand it (yet) does not mean we cannot explain it, or more to the point, code it as numerical values on perhaps yet-to-be-thought-of dimensions.
I also liked the sidebar boxes. The reference to the nearly 50 year-old work of Mosteller & Wallace with the Federalist Papers is classic. It is a quantitative analysis of the incidence of certain words used mostly only by Hamilton, Jay, or Madison, which revealed the authorship of the essays whose authorship was in dispute (most were by Madison).
What the word-counting analysis of the Federalist Papers has to do with the Netflix Prize algorithms was not clear to me; could anyone explain? Obviously, human behavior, including our writing, can be analyzed quantitatively, but how does that apply to the Netflix Prize algorithm?
The poor rating scale itself may be the inherent limitation with any further improvement in the algorithm predicting future ratings. The 5-point scale is crude. Since Netflix customers rarely rent, and subsequently rate, films they don't like, the 5-point scale is functionally a 3-point scale (3, 4, and 5). I'm not sure if the "Not Interested" rating is coded in the data set as a Zero, but that rating needs to be analyzed separately, as interest in a film lies on a different dimension of attitude from liking, or not liking, a film.
I've rated 1780 films and not one rating is the "Not Interested" value. In a recent topic's posts, many other readers of this site said they'd never used that rating value either.
The 5-point scale, with 3 points that account for most ratings and 2 that are rarely used, could easily be converted, without losing existing ratings data, by adding the values 1.5, 2.5, 3.5, and 4.5.
Then Netflix would have a 9-point scale, of which 6 or 7 would be frequently used, and thus allow quantitative models to predict with greater accuracy than currently.
My other problem with the article was the extended discussion of Tversky & Kahneman’s Anchoring work. While it is a valid concept, and was featured as “new” research by a behavioral economist on NPR a few days ago, in which the last 2 digits of people’s social security numbers influenced the amounts they bid on hypothetical items (those with “68” as the last 2 digits bid more than those with “12”), we have been including Anchoring in our psychology courses for at least 30 years (that I know of) and it has been a staple in the cognition chapters in our texts for at least that long. What application it has to the Netflix Prize algorithm appears rather small, but small improvements may be what wins that prize.
Posted by: profpudwick | February 29, 2008 at 06:11 PM