The Machine Learning Abstracts: Support Vector Machines

While earlier entries in this series covered elementary classification algorithms, the Support Vector Machine (SVM) is another, more advanced machine learning algorithm that can be used for classification.

By Narendra Nath Joshi, Carnegie Mellon.

This vector, that vector, every vector

In the last post, we discussed one type of classification algorithm: Decision Trees.

Another machine learning algorithm that can be used for classification is the Support Vector Machine (SVM).


Support Vector Machine? What kinda machine is that?

Let’s break it down. (If you thought of some lame jest about a machine breaking down, you’re like me lol)

Just like any classification algorithm, support vector machines learn to assign any given data point to one of several classes.

The key to understanding SVMs is to study how they do that.

Each data point, when plotted, can be represented as a vector from the origin. Remember this; we shall call it the point-vector thingy (the formal name is “position vector”, but thingy it is).

Hence, when the training data is plotted, the SVM looks for a hyperplane that separates the different classes as cleanly as possible, to achieve the maximum training classification accuracy.

Wai whaa, those are a lot of words I’m not comfortable with!

Hyperplane: A flat boundary (a line in 2D, a plane in 3D) which divides the space into two disconnected parts, where each part can be thought of as a class and each point in that part belongs to that particular class.
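To make that a bit more concrete, here is a minimal Python sketch of "which side of the hyperplane is this point on?". The names `side`, `w` and `b`, the example line x = 2, and the choice of which side counts as black are all assumptions made up for illustration.

```python
def side(point, w, b):
    """Return the class of `point` based on which side of the hyperplane
    w . x + b = 0 it falls on (works for any number of dimensions)."""
    score = sum(wi * xi for wi, xi in zip(w, point)) + b
    # Assumption for this sketch: the positive side is "black".
    return "black" if score > 0 else "white"

# A hypothetical hyperplane in 2-D: the vertical line x = 2.
w, b = (1.0, 0.0), -2.0

print(side((3, 2), w, b))  # right of the line -> black
print(side((1, 2), w, b))  # left of the line  -> white
```

Everything on one side of the line gets one label and everything on the other side gets the other, which is all a hyperplane-based classifier really does at prediction time.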

Training classification accuracy: The accuracy of the classifier on the training data itself, that is, the fraction of training data points it labels correctly.
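As a rough sketch, training accuracy can be computed like this. The helper `training_accuracy`, the six dots and the two threshold rules are hypothetical, invented only for this example.

```python
def training_accuracy(classify, points, labels):
    """Fraction of training points whose predicted label matches the true one."""
    correct = sum(1 for p, label in zip(points, labels) if classify(p) == label)
    return correct / len(points)

# Hypothetical training data: (x, y) dots with their true colours.
points = [(1, 2), (0, 0), (0, 4), (3, 2), (4, 0), (4, 4)]
labels = ["white", "white", "white", "black", "black", "black"]

def good_rule(p):
    # Splits the dots at x = 2, which separates the two colours.
    return "black" if p[0] > 2 else "white"

def sloppy_rule(p):
    # Splits at x = 3.5, so the black dot at (3, 2) ends up on the wrong side.
    return "black" if p[0] > 3.5 else "white"

print(training_accuracy(good_rule, points, labels))    # all 6 correct
print(training_accuracy(sloppy_rule, points, labels))  # 5 of 6 correct
```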

Cool, got ’em. Continue!

In order to achieve the best possible performance on unseen data, we have to generalize the classifier as much as possible. (More on this coming soon!)

As a result, the hyperplane the SVM learns has to be the maximal margin hyperplane.

MMH, yo!

We know what a hyperplane is. It is called maximal margin because the margin, the gap between the hyperplane and the closest training points of each class, has to be as large as possible.
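Here is a small Python sketch of what "margin" means in practice. It assumes a 2-D hyperplane written as w . x + b = 0 and classes labelled -1 and +1. The helper `margin`, the six points and the two candidate lines are made up for illustration.

```python
import math

def margin(points, labels, w, b):
    """Distance from the line w . x + b = 0 to the closest training point.
    Comes out negative if any point is on the wrong side."""
    norm = math.hypot(w[0], w[1])
    return min(
        label * (w[0] * x + w[1] * y + b) / norm
        for (x, y), label in zip(points, labels)
    )

# Hypothetical training data: three points per class.
points = [(1, 2), (0, 0), (0, 4), (3, 2), (4, 0), (4, 4)]
labels = [-1, -1, -1, +1, +1, +1]

# Two lines that both separate the classes perfectly.
tilted = margin(points, labels, w=(2.0, 1.0), b=-6.0)
upright = margin(points, labels, w=(1.0, 0.0), b=-2.0)

print(tilted, upright)   # the upright line has the larger margin
```

Both lines get 100% training accuracy, but the upright one leaves more room between itself and the nearest dots, so it is the one a maximal margin classifier would prefer.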

That’s all good, show me an example!

Look at the following figure, which we will consider as our training data. Our task is to classify dots as black or white.

The solid line is the maximal margin hyperplane the SVM learns from the training data.

Ignore the math because these are the Machine Learning Abstracts

The two dotted lines, one for each class, pass through the closest training data points in the respective classes.

By the point-vector thingy, the closest vectors in each class lie on the dotted lines.

In other words, these vectors support the maximal margin hyperplane, which is why they are called support vectors. If they move, the MMH moves.
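To see the "if they move, the MMH moves" part in action, here is a toy Python sketch. It is a brute-force search over directions in 2-D, written only for illustration; real SVM implementations solve an optimization problem instead. The function names, the six points and the `steps` parameter are all assumptions.

```python
import math

def max_margin_line(points, labels, steps=3600):
    """Brute-force the maximal margin line w . x + b = 0 in 2-D by trying
    `steps` evenly spaced unit directions for w. Returns (w, b, margin)."""
    best = None
    for k in range(steps):
        angle = 2 * math.pi * k / steps
        w = (math.cos(angle), math.sin(angle))
        # Project every point onto the direction w.
        proj = [w[0] * x + w[1] * y for x, y in points]
        lowest_pos = min(p for p, lab in zip(proj, labels) if lab == +1)
        highest_neg = max(p for p, lab in zip(proj, labels) if lab == -1)
        # Half the gap between the classes along w; negative if they overlap.
        margin = (lowest_pos - highest_neg) / 2
        if best is None or margin > best[2]:
            best = (w, -(lowest_pos + highest_neg) / 2, margin)
    return best

def support_vectors(points, w, b, margin, tol=1e-9):
    """Training points that sit exactly on the margin (the dotted lines)."""
    return [p for p in points
            if abs(abs(w[0] * p[0] + w[1] * p[1] + b) - margin) <= tol]

# Hypothetical training data: three points per class, labelled -1 and +1.
points = [(1, 2), (0, 0), (0, 4), (3, 2), (4, 0), (4, 4)]
labels = [-1, -1, -1, +1, +1, +1]

w, b, margin = max_margin_line(points, labels)
print(support_vectors(points, w, b, margin))   # the points closest to the line

# Moving a point that is NOT a support vector, (0, 0) -> (-3, 0): same line.
moved_other = [(1, 2), (-3, 0), (0, 4), (3, 2), (4, 0), (4, 4)]
print(max_margin_line(moved_other, labels) == (w, b, margin))    # True

# Moving a support vector, (3, 2) -> (5, 2): the line shifts.
moved_support = [(1, 2), (0, 0), (0, 4), (5, 2), (4, 0), (4, 4)]
print(max_margin_line(moved_support, labels) == (w, b, margin))  # False
```

In this toy data only (1, 2) and (3, 2) touch the margin, so only they pin the line in place. The other four points can wander around (as long as they stay outside the margin) without changing anything.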


I love Seinfeld!

Next, I would like to take a small step back from different classification algorithms and talk about some critical machine learning concepts in general.

Coming soon!

Bio: Narendra Nath Joshi is a graduate student in AI and Machine Learning at Carnegie Mellon University, currently pursuing a research internship at Disney Research Pittsburgh. He has a keen interest in natural language, computer vision, and deep learning.

Original. Reposted with permission.