Regression & Correlation for Military Promotion: A Tutorial

A tutorial covering the concepts of regression and correlation, using military commander promotion as a use case.

The Problem


The strength of a military depends on its commanders. While some commanders risk their lives to save others, others betray their comrades for personal gain. How do we select the right people for positions of command?

Because military recruits undergo various assessments, we may already have a database of soldier attributes such as physical fitness, IQ and personality traits. To put that database to use, however, we need to know:

  1. Can any of these attributes be used to predict commander potential?
  2. If so, how do they rank in prediction accuracy?
  3. How do we combine multiple attributes to further improve predictions?

Besides commander selection, prediction techniques can also be used to identify personnel on the opposite end of the spectrum – soldiers who are likely to adjust poorly. Beyond HR, prediction techniques have many other applications: weather forecasting, stock market pricing, population dynamics and actuarial science, among others.

An Illustration

Back to our example of commander selection. Let’s say we collect data on a group of soldiers. We have their physical fitness and IQ scores (attributes), as well as supervisor ratings of their commander potential. Fitness and IQ scores can be collected at the time of recruitment, but ratings of commander potential are based on soldiers’ performance throughout commander school and are only finalized upon graduation.

It would be much better if we could predict soldiers’ commander potential earlier on, so that only the most promising candidates are shortlisted for commander school.

To see if any attribute can be used to predict commander potential, scatterplots come in handy:

[Figure: scatterplot of commander potential against physical fitness]
[Figure: scatterplot of commander potential against IQ]

The above plots (based on mock data) show consistent trends – fitter and smarter soldiers tend to be rated higher on commander potential.
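A trend seen in a scatterplot can also be summarized as a single number, the Pearson correlation coefficient, which ranges from -1 to 1. The sketch below is only an illustration: the scores for six soldiers are made up for this example (they are not the mock data behind the plots), and `pearson_r` is a helper name introduced here.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for six soldiers, invented for illustration only.
fitness = [55, 60, 70, 75, 85, 90]
iq = [80, 95, 90, 105, 110, 120]
potential = [50, 58, 62, 70, 78, 85]

r_fitness = pearson_r(fitness, potential)
r_iq = pearson_r(iq, potential)
print(f"fitness vs potential: r = {r_fitness:.2f}")
print(f"IQ vs potential:      r = {r_iq:.2f}")
```

A value close to 1 means the points lie close to an upward-sloping line, so comparing the coefficients gives a rough way to rank attributes by how strongly they track commander potential.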

Deriving these trend lines is the key to making predictions. For example, if a soldier scores 85 for IQ, the trend line predicts a commander potential score of 70.
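To make the idea of reading a prediction off a trend line concrete, here is a minimal sketch. The slope and intercept are assumed values, picked only so that the line passes through the example above (IQ of 85, predicted potential of 70); they are not taken from the plots, and `predict_potential` is a name introduced for this illustration.

```python
# Hypothetical straight trend line: potential = INTERCEPT + SLOPE * IQ.
# Both numbers are made-up values chosen to reproduce the example in the
# text (IQ 85 -> potential 70); they are not fitted to any real data.
SLOPE = 0.5        # assumed gain in commander potential per IQ point
INTERCEPT = 27.5   # assumed predicted potential at an IQ of 0

def predict_potential(iq_score):
    """Read the predicted commander potential off the trend line."""
    return INTERCEPT + SLOPE * iq_score

print(predict_potential(85))  # 70.0
```

Once the slope and intercept are known, predicting for a new soldier is a single multiplication and addition; the real work lies in finding those two numbers.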

The question is, how are trend lines derived?