The Guerrilla Guide to Machine Learning with R
This post is a lean look at learning machine learning with R. It is a complete, if very short, course for the quick study hacker with no time (or patience) to spare.
Sure, there are lots of tutorials and overviews on picking up machine learning, but many (most?) of them take the long view: get a foundation first, learn the basics next, then learn a bit of complementary theory before getting too far ahead of yourself in practical terms, take a step back, try your hand at a few examples, undertake a project on your own... This is all great advice, and a great approach to learning... well, almost anything.
I know, I know... still not funny.
But let's say you're not starting from scratch. Or you're a savant. Or you don't have the patience to go through all of the motions. Let's say you want to hit the ground running and scramble under pressure to learn everything right now. Is that the best approach? Ideally, no, but I'm in no position to judge. I work best under pressure, and can sympathize with the impatient among us who just want to get on with things.
Let's be clear: this is assuredly not the path to achieve greatness quickly. With learning machine learning -- much as with machine learning itself -- there is no free lunch. However, if you want a practical overview in order to test the waters and decide whether learning more about the topic is what you are after, or if you already possess a solid theoretical understanding of many of these (or related) concepts, the way of the guerrilla may be for you.
With that in mind, here is a bare bones take on learning machine learning with R, a complete course for the quick study hacker with no time (or patience) to spare.
Preparing and Learning R
First step is always first: let's learn R. Or, let's learn R quickly. You will need to install R and RStudio -- the programming language and an integrated development environment (IDE) for R, respectively. There are all sorts of videos on getting this accomplished, which vary from operating system to operating system. As such, I will leave you to your own devices to find a video for your personal configuration.
However, if you are interested in some words that will help get this installation task accomplished, have a look at Installing R and RStudio from McMaster University.
Once you have R installed, no doubt you are going to want to know how to use it (that's why you're here, right?). This first video, by Brian Palmer, is an introduction to basic R commands, approached from the perspective of introductory statistics.
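If you would rather type along than just watch, here is a minimal sketch of the sort of basic commands an introductory-statistics tour of R tends to cover. The `scores` vector is made-up illustration data, not something taken from the video, and only base R is used.

```r
# Hypothetical exam scores (made-up illustration data)
scores <- c(72, 85, 90, 66, 78, 95, 88)

n      <- length(scores)  # number of observations
avg    <- mean(scores)    # arithmetic mean
med    <- median(scores)  # middle value of the sorted data
spread <- sd(scores)      # sample standard deviation (n - 1 denominator)

# Vectorised arithmetic: standardise every score in one expression
z <- (scores - avg) / spread

# Logical indexing: keep only the scores above the mean
above_avg <- scores[scores > avg]

print(summary(scores))  # five-number summary plus the mean
print(above_avg)
```

Note that nothing here needs a loop: arithmetic and comparisons apply to whole vectors at once, which is the habit worth picking up early in R.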
The following video by Ralf Becker looks at some very basic data analysis of a dataset, which is good prep for the next section.
Machine Learning Crash Course
Next, let's have a look at an introductory machine learning overview. While there are all sorts of videos on this topic -- including many I have used and/or suggested in the past -- I like to keep it fresh, and so we will use the following video from Nando de Freitas (of the University of British Columbia at the time of recording). Note that this is not at all R-specific, and covers machine learning conceptually as opposed to practically.
This is a great introduction to machine learning from a great mind of the field. Incidentally, de Freitas has many other great machine learning videos available on YouTube, which would be worth checking out if you are looking for further materials on the subject.
R for Data Analysis
Though this next pair of videos includes the term "data analysis" in their titles, they veer into what could easily be considered machine learning territory. The videos move slowly, however, and cover a lot of descriptive statistics in R, which is a good place to ramp up with the language. These great videos are created by Dave Langer of Microsoft.
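For a taste of descriptive statistics in R before (or after) watching, here is a rough sketch using the `mtcars` dataset that ships with R. The choice of dataset and columns is mine, purely for illustration, and is not necessarily what the videos use.

```r
# mtcars is a small data frame bundled with base R
data(mtcars)

str(mtcars)                 # column types and a preview of the values
print(summary(mtcars$mpg))  # min, quartiles, mean, and max of miles per gallon

# Frequency table: how many cars have 4, 6, or 8 cylinders?
cyl_counts <- table(mtcars$cyl)
print(cyl_counts)

# Grouped summary: mean miles per gallon for each cylinder count
mpg_by_cyl <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
print(mpg_by_cyl)

# Pearson correlation between fuel economy and weight
mpg_wt_cor <- cor(mtcars$mpg, mtcars$wt)
print(mpg_wt_cor)
```

The same handful of functions (`str`, `summary`, `table`, `aggregate`, `cor`) will carry you through a first look at most tabular datasets.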
Machine Learning with R
To continue our foray into machine learning with R, have a look at this video by Bargava Subramanian, titled "Machine Learning Using R: Crash Course In Classification Methods." The video does a good job of bridging the theory introduced two sections back with practical implementations in R.
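To make "classification in R" concrete, here is a minimal sketch of one common method, logistic regression, fit with base R's `glm`. The method, the built-in `iris` data, the 70/30 split, and the 0.5 cutoff are all my own assumptions for illustration; the video may well use different methods and data.

```r
set.seed(42)  # make the random train/test split reproducible

# Keep two of the three iris species so the problem is binary
two_class <- droplevels(subset(iris, Species != "setosa"))

# Hold out 30% of the rows for testing
n         <- nrow(two_class)
train_idx <- sample(n, size = round(0.7 * n))
train     <- two_class[train_idx, ]
test      <- two_class[-train_idx, ]

# Logistic regression: with a factor response, glm treats the first level
# ("versicolor") as failure and the other ("virginica") as success
fit <- glm(Species ~ Petal.Length + Petal.Width,
           data = train, family = binomial)

# Predicted probability of "virginica", thresholded at 0.5
prob <- predict(fit, newdata = test, type = "response")
pred <- ifelse(prob > 0.5, "virginica", "versicolor")

# Confusion matrix and accuracy on the held-out rows
confusion <- table(predicted = pred, actual = test$Species)
accuracy  <- mean(pred == test$Species)
print(confusion)
print(accuracy)
```

The pattern of split, fit, predict, and evaluate stays the same when you swap `glm` for a different classifier, which is the main thing to take away here.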
Advanced Machine Learning in R
I hesitate to call this advanced machine learning, but in the context of what we have seen thus far I think the label works.
We will have a look at two distinct machine learning topics. The first is document classification, in this video by Tim D'Auria, which shows "How to Build a Document Classifier in Under 25 Minutes Using R." Fun, right?
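If you want a feel for the moving parts of a document classifier, here is a toy sketch in base R: a bag-of-words document-term matrix feeding a multinomial naive Bayes model with add-one smoothing. To be clear, this is my own stand-in technique and not the approach from the video; the six-document corpus, the labels, and the helper names (`tokenize`, `count_terms`, `classify`) are all hypothetical.

```r
# Tiny made-up corpus with two classes (hypothetical illustration data)
docs <- c("cheap pills buy now", "win money now", "limited offer buy cheap",
          "meeting agenda for monday", "project report attached",
          "lunch meeting on monday")
labels <- factor(c("spam", "spam", "spam", "ham", "ham", "ham"))

# Lowercase the text and split it on anything that is not a letter
tokenize <- function(text) {
  words <- strsplit(tolower(text), "[^a-z]+")[[1]]
  words[nzchar(words)]
}

vocab <- sort(unique(unlist(lapply(docs, tokenize))))

# Count how often each vocabulary word occurs in a text;
# words outside the vocabulary are silently dropped
count_terms <- function(text) {
  as.vector(table(factor(tokenize(text), levels = vocab)))
}

# Document-term matrix: one row per document, one column per word
dtm <- t(sapply(docs, count_terms, USE.NAMES = FALSE))
colnames(dtm) <- vocab

# Multinomial naive Bayes with add-one (Laplace) smoothing
class_counts <- rowsum(dtm, group = labels)  # word counts per class
log_lik   <- log((class_counts + 1) /
                 (rowSums(class_counts) + length(vocab)))
log_prior <- log(as.vector(table(labels)) / length(labels))

# Score a new text under each class and return the best-scoring label
classify <- function(text) {
  x      <- count_terms(text)
  scores <- log_prior + as.vector(log_lik %*% x)
  rownames(log_lik)[which.max(scores)]
}

print(classify("buy cheap pills"))
print(classify("monday project meeting"))
```

A real project would lean on a text-mining package and far more data, but the pipeline has the same shape: tokenize, count, fit, score.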
Finally, Hamed Hasheminia gives a quick look at implementing neural networks in R.
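As a hands-on companion, here is a minimal sketch of a single-hidden-layer neural network using the `nnet` package, which is bundled with most R installations. The package choice, the `iris` data, and the hyperparameters (`size`, `decay`, `maxit`) are my own assumptions for illustration; the video may use a different package entirely.

```r
# Assumes the nnet package is available (it ships with most R installs)
library(nnet)

set.seed(1)  # reproducible split and weight initialisation

# Hold out 50 of the 150 iris rows for testing
train_idx <- sample(nrow(iris), 100)
train     <- iris[train_idx, ]
test      <- iris[-train_idx, ]

# One hidden layer with 4 units; decay adds a little weight regularisation
fit <- nnet(Species ~ ., data = train,
            size = 4, decay = 0.01, maxit = 200, trace = FALSE)

# Predict class labels for the held-out rows and measure accuracy
pred     <- predict(fit, newdata = test, type = "class")
accuracy <- mean(pred == test$Species)

print(table(predicted = pred, actual = test$Species))
print(accuracy)
```

Try changing `size` or `decay` and rerunning to see how sensitive the results are; that kind of poking around is half the fun.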
Looking for More?
A logical next step may be covering individual algorithms in more depth, and these are a few additional resources in this direction:
- R Learning Path: From beginner to expert in R in 7 steps
- Building Regression Models in R using Support Vector Regression
- A Beginner’s Guide to Neural Networks with R!
Hopefully this is enough to get the motivated hacker up and practicing with machine learning in a few days. It's no substitute for deeper study, however; understanding the underlying statistical building blocks of machine learning takes years, and becoming proficient in practice takes hundreds of hours of doing. However, no one said you can't get your hands dirty and have some introductory fun on the way to perfecting a new craft.