KDnuggets Home » News » 2011 » Jun » Software » The Netflix Prize, Big Data, SVD and R  ( < Prev | 11:n14 | Next > )

The Netflix Prize, Big Data, SVD and R


 
  
Bryan Lewis shows how to use the irlba R package to compute the SVD of the Netflix Prize data set.


David Smith, Revolutions Blog, May 31, 2011

One of the key data analysis tools that the BellKor team used to win the Netflix Prize was the Singular Value Decomposition (SVD) algorithm. As a file on disk, the Netflix Prize data (a matrix of about 480,000 members' ratings for about 18,000 movies) was about 65 GB in size, too large to be read directly into the standard in-memory data model of open-source R. But in the video below, Bryan Lewis shows how to use the sparse Matrix object in R to store the data efficiently (only about 99 million of the matrix entries are actual movie ratings) and the irlba package (which features a fast and efficient SVD algorithm for big data) to perform SVD analysis on the Netflix data in R.
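As a rough illustration of the approach described above, here is a minimal sketch in R. It does not use the Netflix data and is not taken from the video: the ratings are synthetic, and the sizes (`n_users`, `n_movies`, `n_ratings`) and the choice of 5 singular values are assumptions picked only to keep the example small. It relies on `sparseMatrix()` from the Matrix package and `irlba()` from the irlba package.

```r
library(Matrix)  # sparse matrix classes
library(irlba)   # truncated (partial) SVD

set.seed(1)

# Hypothetical sizes, far smaller than the real Netflix Prize data
n_users   <- 2000
n_movies  <- 500
n_ratings <- 20000

# Pick distinct (user, movie) cells so that no rating is entered twice
cells <- sample.int(n_users * n_movies, n_ratings)
user  <- (cells - 1) %% n_users + 1
movie <- (cells - 1) %/% n_users + 1

# Synthetic ratings from 1 to 5 stars
stars <- as.numeric(sample(1:5, n_ratings, replace = TRUE))

# Only the observed ratings are stored; missing entries take no memory
ratings <- sparseMatrix(i = user, j = movie, x = stars,
                        dims = c(n_users, n_movies))

# Compute only the 5 largest singular values and their singular vectors
fit <- irlba(ratings, nv = 5)

print(fit$d)       # approximate leading singular values
print(dim(fit$u))  # left singular vectors: users x 5
print(dim(fit$v))  # right singular vectors: movies x 5
```

The point of the sketch is that the matrix is never expanded to dense form: `irlba()` works with the sparse object directly and returns just the requested leading singular triplets rather than the full decomposition.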

blog.revolutionanalytics.com/2011/05/the-neflix-prize-big-data-svd-and-r.html

Big Computing: Bryan Lewis's Vignette on IRLBA for SVD in R


 
