Netflix Prize Data Dissected
Ilya has posted his analysis of the Netflix Prize data, more than 2 gigabytes of it, on his blog.
An interesting one once again: the mean rating is 3.8! I guess on average we are satisfied with our movie choices, which makes it that much harder for an ML algorithm to find an ‘even more’ interesting movie. One pattern that you cannot observe here is that the mean tends to drift to the right as the number of users grows: in 1998 the average was 3.4, and by 2005 it had steadily moved to 3.8. I wonder what accounts for the drift? Are early adopters (techies) more discerning in their ratings/choices? Are the movies getting better? (Doubt it!)
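For anyone who wants to check the drift themselves, here is a rough sketch of how the per-year mean could be computed. It assumes the ratings have already been parsed into (rating, date) pairs with dates written as YYYY-MM-DD. The function name and the sample values are made up for illustration, and this is not necessarily how Ilya did it.

```python
from collections import defaultdict


def mean_rating_by_year(ratings):
    """Return {year: mean rating} for an iterable of (rating, 'YYYY-MM-DD') pairs."""
    totals = defaultdict(lambda: [0, 0])  # year -> [sum of ratings, number of ratings]
    for rating, date in ratings:
        year = int(date[:4])  # assumes the date string starts with a 4-digit year
        totals[year][0] += rating
        totals[year][1] += 1
    return {year: total / count for year, (total, count) in sorted(totals.items())}


# Made-up sample rows, NOT taken from the Netflix Prize data.
sample = [
    (3, "1998-05-01"),
    (4, "1998-11-20"),
    (4, "2005-03-14"),
    (5, "2005-07-02"),
    (3, "2005-09-30"),
]
yearly_means = mean_rating_by_year(sample)
print(yearly_means)
```

Run over the full data set, the values for 1998 and 2005 should come out near the 3.4 and 3.8 mentioned above if the drift is real.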
Ilya has also posted the graphs on Flickr.
via Netflix Fan Blog.








