The Netflix Prize requires developing a new rating algorithm that improves by more than 10% on Cinematch, the system Netflix currently uses to suggest movies to its customers. According to the contest’s Leaderboard, it looks like the $1,000,000 Grand Prize will be awarded shortly.
The Netflix Prize provides some interesting lessons in analytic strategy. In addition to the Grand Prize, a $50,000 Progress Prize is awarded each year until the Grand Prize is won. The Progress Prize was awarded in 2007 and 2008. The Netflix Prize has become quite well known because of the prize money being offered. It deserves to be just as well known for the analytic strategy Netflix chose.
It is relatively common for a company or an organization to spend a million dollars on an analytic project. It is less common for something useful to result from it. I don’t have any inside knowledge about the Netflix Prize, but I think that it illustrates several valuable lessons about analytic strategy. Here are three of them.
Lesson 1. Agree upon a metric to measure the effectiveness of an analytic model and use it consistently. It is usually not possible to find a single metric that captures all the relevant information required when comparing two analytic systems. Certainly any actual ratings system requires several metrics. For example, one metric might measure how many stars a viewer would assign to a movie, and another might measure how often a viewer selects the movies recommended. On the other hand, once a single metric is chosen, it becomes straightforward to compare two recommendation algorithms. It then becomes simple to use the metric to create a dashboard (the Netflix Leaderboard) and to use the dashboard to track progress. Netflix chose the root mean squared error (RMSE) between the ratings predicted by a proposed system and the actual ratings given by users in a validation dataset. Over 49,000 contestants from over 180 different countries formed over 40,000 teams, entered the contest, and tried to develop a recommendation algorithm with a low enough RMSE to win the Grand Prize. In my experience, most companies and organizations lack the discipline to use a single (lead) metric to compare two analytic systems and to track, with a dashboard, the progress made in improving an analytic system over time. Having the discipline to do so is one sign of the analytic maturity of a company.
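To make Lesson 1 concrete, here is a minimal Python sketch of how a single lead metric can drive a leaderboard. It is not Netflix's scoring code. The function names (`rmse`, `improvement_pct`, `leaderboard`), the team names, and all of the ratings are made up for illustration; only the idea of ranking by RMSE and measuring improvement against a baseline comes from the contest description above.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared_error / len(predicted))


def improvement_pct(baseline_rmse, candidate_rmse):
    """Percentage by which the candidate lowers RMSE relative to the baseline."""
    return 100.0 * (baseline_rmse - candidate_rmse) / baseline_rmse


def leaderboard(baseline_rmse, scores):
    """Rank teams by RMSE (lowest first) with improvement over the baseline."""
    ranked = sorted(scores.items(), key=lambda item: item[1])
    return [
        (team, score, improvement_pct(baseline_rmse, score))
        for team, score in ranked
    ]


# Made-up ratings on a 1-5 star scale, for illustration only.
actual = [4, 3, 5, 2, 1]
baseline_predictions = [3.0, 3.0, 3.0, 3.0, 3.0]  # hypothetical baseline system
team_predictions = {
    "team_a": [3.8, 3.2, 4.5, 2.4, 1.6],  # hypothetical contestants
    "team_b": [3.5, 3.0, 4.0, 2.5, 2.0],
}

baseline = rmse(baseline_predictions, actual)
scores = {team: rmse(preds, actual) for team, preds in team_predictions.items()}

for team, score, gain in leaderboard(baseline, scores):
    print(f"{team}: RMSE={score:.4f}, improvement={gain:.1f}%")
```

Because every entry is reduced to one number, deciding whether a team has crossed a threshold such as 10% is a one-line comparison, which is what makes the dashboard easy to build and hard to argue with.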
Lesson 2. Don’t be afraid to disclose analytic technology you develop if the advantages outweigh the disadvantages. In general, it makes sense for companies and organizations not to disclose the proprietary technology they use. On the other hand, there are some important exceptions.
- One exception is patents. Patents provide some important protections, but the trade-off is that the technology must be disclosed in the patent filing.
- Another exception is when the software from an internal analytic project is made open source, or when an internal project decides to contribute to an existing open source software project. Again, there is a trade-off. Some technology is disclosed, but the benefit is the community support that many open source projects engender.
- Crowdsourcing is a similar type of exception. The benefit is the innovation that crowdsourcing can provide. The downside is that crowdsourcing discloses technology that may be critical to your business.
Netflix found that with Cinematch customers rented more movies and were less likely to cancel their subscriptions. Cinematch was introduced in 2000 and improved each year until a plateau was reached in 2006. In the summer of 2006, Reed Hastings, the CEO of Netflix, suggested a public contest to improve Cinematch. According to an article in the New York Times, “Cinematch suggestions… drive a surprising 60 percent of Netflix’s rentals.” By setting a threshold for the prize of 10% or more improvement, Netflix would obtain enough incremental revenue from an improved Cinematch system to make up for any information that Netflix’s competitors might gain. Again, this is a good analytic strategy.
Lesson 3. Double and triple check any data before making it public. No company or organization would knowingly make data public that contains personally identifiable information (PII) without permission. On the other hand, even if data does not contain PII per se, PII can often be inferred from it, as happened when AOL released 3 months of sample query logs in 2006. For less obvious ways to break the anonymization of data, see the paper “Wherefore art thou r3579x?”. In some cases, it can be quite challenging to anonymize data so that it does not contain PII, especially if the data is being updated. On the other hand, making data public enables a broad community to contribute to your problem.
Finally, it is interesting to think about the size of the data used for the prize. The data consisted of over 100 million movie ratings from 480 thousand randomly chosen, anonymous Netflix customers. They rated over 17 thousand movie titles during the period from October 1998 to December 2005. In some sense, this is a lot of data. Certainly there are a lot of degrees of freedom in the dataset. On the other hand, it is less than 2 GB of data and easily fits in the memory of a modest-sized computer. From this perspective, it is a small amount of data. From the viewpoint of analytic infrastructure, it is useful to classify data as small (fits into the memory of a single computer), medium (fills the disks of a single storage device or fits into a database), or large (requires specialized infrastructure such as a cloud).
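The small/medium/large classification can be sketched in a few lines of Python. This is only a rough illustration: the 16 bytes per rating, the 8 GiB of memory, and the 4 TiB of disk are assumptions I picked for the example, not facts about the Netflix data or about any particular machine, and the record count is rounded down to 100 million.

```python
GIB = 1024 ** 3
TIB = 1024 ** 4


def estimated_bytes(num_records, bytes_per_record):
    """Back-of-the-envelope size of a dataset with fixed-size records."""
    return num_records * bytes_per_record


def classify_data_size(data_bytes, ram_bytes, disk_bytes):
    """Classify data as small, medium, or large relative to one machine."""
    if data_bytes <= ram_bytes:
        return "small"   # fits into the memory of a single computer
    if data_bytes <= disk_bytes:
        return "medium"  # fits on the disks of a single storage device
    return "large"       # needs specialized infrastructure


# Assumed figures, for illustration only.
ratings = 100_000_000      # "over 100 million" ratings, rounded down
bytes_per_rating = 16      # hypothetical: customer id, movie id, rating, date
ram = 8 * GIB              # hypothetical machine memory
disk = 4 * TIB             # hypothetical single-device storage

size = estimated_bytes(ratings, bytes_per_rating)
print(f"{size / GIB:.2f} GiB -> {classify_data_size(size, ram, disk)}")
```

The boundaries depend on the hardware you have, so the same dataset can be small on one machine and medium on another.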
For more information:
- R. M. Bell and Y. Koren, Lessons from the Netflix prize challenge, SIGKDD Explorations Newsletter, Volume 9, Number 2 (Dec. 2007), pages 75-79. DOI= http://doi.acm.org/10.1145/1345448.1345465 (subscription required)
- Clive Thompson, The Screens Issue. If You Liked This, You’re Sure to Love That, New York Times, November 23, 2008 (registration required).
- L. Backstrom, C. Dwork, and J. Kleinberg, Wherefore art thou r3579x?: anonymized social networks, hidden patterns, and structural steganography, Proceedings of the 16th International Conference on World Wide Web (WWW ‘07), ACM, New York, NY, pages 181-190. (subscription required)
Upcoming Course. I’ll be using this example in an upcoming course I’m teaching in San Mateo on July 14, 2009.
