Arvind Narayanan and Vitaly Shmatikov of the University of Texas have published a paper, "How To Break Anonymity of the Netflix Prize Dataset," which claims that the Netflix Prize customer data can be analyzed to expose individual users:
We demonstrate that an attacker who knows only a little bit about an individual subscriber can easily identify this subscriber’s record if it is present in the dataset, or, at the very least, identify a small set of records which include the subscriber’s record. For example, an attacker who knows the subscriber’s ratings on 6 movies that are not among the top 100 most rated movies and approximate dates when these ratings were entered into the system has an 80% chance of successfully identifying the subscriber’s record. This knowledge need not be precise, e.g., the dates may only be known to the attacker with a 14-day error, the ratings may be known only approximately, and some of the ratings may even be completely wrong. Even without any knowledge of rating dates, knowing the subscriber’s ratings on 6 movies outside the top 100 most rated movies reduces the set of plausible candidate records to 8 (out of almost 500,000) for 70% of the subscribers whose records have been released. With the candidate set so small, deanonymization can then be completed with additional human analysis.
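To make the idea in the abstract concrete, here is a minimal Python sketch of that kind of matching. It is a simplified illustration, not the authors' algorithm: the function names, the toy records, the one-star rating tolerance and the `min_hits` threshold are all assumptions made for the example. Only the 14-day date tolerance comes from the abstract.

```python
from datetime import date

def match_score(known, record, rating_tol=1, date_tol_days=14):
    """Count the attacker's known ratings that are consistent with a record.

    Both arguments map movie -> (rating, date). The tolerances are
    illustrative assumptions, apart from the 14-day window in the abstract.
    """
    hits = 0
    for movie, (rating, when) in known.items():
        if movie not in record:
            continue
        rec_rating, rec_when = record[movie]
        rating_ok = abs(rec_rating - rating) <= rating_tol
        date_ok = abs((rec_when - when).days) <= date_tol_days
        if rating_ok and date_ok:
            hits += 1
    return hits

def candidates(known, dataset, min_hits):
    """Return the IDs of records agreeing with at least min_hits known ratings."""
    return sorted(
        rid for rid, rec in dataset.items()
        if match_score(known, rec) >= min_hits
    )

# Hypothetical "anonymized" records: movie -> (stars, date rated).
dataset = {
    "record_a": {
        "m1": (4, date(2005, 3, 1)),
        "m2": (2, date(2005, 3, 10)),
        "m3": (5, date(2005, 4, 2)),
    },
    "record_b": {
        "m1": (1, date(2005, 6, 1)),
        "m2": (2, date(2004, 1, 1)),
        "m4": (3, date(2005, 4, 2)),
    },
}

# What the attacker believes about the target. The dates are approximate
# and the "m3" rating is deliberately wrong.
known = {
    "m1": (5, date(2005, 3, 8)),
    "m2": (2, date(2005, 3, 1)),
    "m3": (1, date(2005, 4, 1)),
}

print(candidates(known, dataset, min_hits=2))
```

Even with one wrong rating and imprecise dates, only one toy record agrees with two of the three known ratings. With real data the shortlist would be longer, which is where the "additional human analysis" mentioned above comes in.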
I publish my main Netflix queue on the side of this site, and I have a ton of Netflix friends, but I can understand how some people would be concerned if a co-worker, family member or friend knew what movies they were renting.
via Digg.
I, for one, would be mortified if friends or family discovered that I had seen Michael Bay's "The Island." What a waste of two hours.
Posted by: hawk5391 | March 01, 2007 at 12:52 PM
It's not as important as medical records or financial information, but most of us in the US of A do expect a right to privacy.
Posted by: Edward R Murrow | March 01, 2007 at 03:13 PM
I'm curious: how do you publish your netflix queue? Is it a plugin or something you maintain (or that Typepad does for you?)
Posted by: swanksalot | March 01, 2007 at 06:06 PM
boondoggle.
Posted by: | March 02, 2007 at 02:10 AM