Top 10 R Packages to be a Kaggle Champion

Kaggle top ranker Xavier Conort shares insights on the “10 R Packages to Win Kaggle Competitions”.



Across all major surveys, R consistently ranks among the top programming languages for data scientists. It is no wonder, then, that knowing the important R packages can be a vital advantage in Kaggle competitions. Xavier Conort (currently Data Scientist at DataRobot) has compiled a list of 10 R packages that played a key role in his top 10 rankings in more than 15 Kaggle competitions (including a few wins).

Since R is also widely used outside the data science community (by statisticians and actuaries, for example), this list of 10 powerful R packages might help you in more ways than you expect.

Here are the 10 packages, grouped by what makes them particularly powerful for building winning solutions:

  Allowing the machine to capture complexity:
  1. gbm [Gradient Boosting Machine]
  2. randomForest [Random Forest]
  3. e1071 [Support Vector Machines]

  Taking advantage of high-cardinality categorical or text data:
  4. glmnet [Lasso and Elastic-Net Regularized Generalized Linear Models]
  5. tau [Text Analysis Utilities]

  Making your code more efficient:
  6. Matrix [Sparse and Dense Matrix Classes and Methods]
  7. SOAR [Memory management in R by delayed assignments]
  8. foreach [Foreach looping construct for R]
  9. doMC [Foreach parallel adaptor for the multicore package]
  10. data.table [Extension of data.frame]
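To illustrate how two of these packages can work together on a high-cardinality categorical feature, here is a minimal sketch. It is not taken from the slides: the data is randomly generated, and the column names `merchant` and `y` are hypothetical. It one-hot encodes the feature as a sparse matrix with Matrix and, if glmnet happens to be installed, fits a lasso-penalised logistic regression on it.

```r
library(Matrix)  # recommended package, ships with standard R installs

set.seed(42)
n <- 1000

# Hypothetical data: one categorical feature with up to 200 distinct levels
df <- data.frame(
  merchant = factor(sample(sprintf("m%03d", 1:200), n, replace = TRUE))
)
df$y <- rbinom(n, 1, 0.3)  # random binary target, for illustration only

# One-hot encode without building a dense n x (number of levels) matrix.
# The "- 1" drops the intercept so every level gets its own column.
x <- sparse.model.matrix(~ merchant - 1, data = df)

# Only one entry per row is stored, however many levels there are
dim(x)
nnzero(x)

# Optional: lasso-penalised logistic regression on the sparse design matrix.
# Guarded so the sketch still runs when glmnet is not installed.
if (requireNamespace("glmnet", quietly = TRUE)) {
  fit <- glmnet::cv.glmnet(x, df$y, family = "binomial", alpha = 1)
  coefs <- coef(fit, s = "lambda.min")  # per-level coefficients, many shrunk to zero
}
```

Because the target here is pure noise, the fitted coefficients mean nothing; the point is only that the sparse encoding keeps memory use proportional to the number of rows rather than rows times levels.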

 
 
Expert Advice for Kaggle Competitions: Use your intuition to help the machine by doing the following:
  • Always compute differences/ratios of features
  • Always consider discarding features that are "too good"
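
The first tip can be sketched in a few lines of base R. The data frame and its column names (`income`, `debt`, `balance_now`, `balance_prev`) are made up for illustration and do not come from the presentation.

```r
# Hypothetical loan data; all names and values are illustrative only
loans <- data.frame(
  income       = c(50000, 82000, 31000),
  debt         = c(10000, 41000, 0),
  balance_now  = c(1200, 300, 950),
  balance_prev = c(1000, 500, 950)
)

# Ratio feature: debt relative to income
loans$debt_to_income <- loans$debt / loans$income

# Difference feature: change in balance between two snapshots
loans$balance_change <- loans$balance_now - loans$balance_prev

loans
```

Tree-based models split on one feature at a time, so handing them an explicit ratio or difference can save them from having to approximate that relationship with many splits.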

 
The complete set of slides for this presentation by Xavier Conort: http://www.slideshare.net/DataRobot/final-10-r-xc-36610234
