Automotive Customer Churn Prediction using SVM and SOM

A case study of predicting customer churn using a Life Time Cycle approach and advanced machine learning methods, including SVM and Self-Organizing Maps.

By Gregory Philippatos, Sep 2014

A worldwide leader in the automotive industry faced a daunting challenge in its After Sales Service business, and more specifically for its authorized repairers: "How to reduce risk exposure, and better serve and market to its customers?"

In the EU, the economic crisis had a negative impact on the after-sales market, a market that was already in transition between a promising potential and a difficult reality. Clients cannot easily perceive maintenance as a product that brings tangible value, because new cars are built to last longer and require maintenance less frequently, and because of new European Commission regulations (the new competition law of 2010).

In this context, the Life Time Cycle of each client is critical, and churn especially so, as it affects the length of the service period and, hence, future profit generation.

Challenge: Customers Annual Churn Prediction

Based on the current market situation and historical data (Figure 1), we defined our prediction objectives as follows:

  1. Annual churn prediction for in-warranty customers (car age <4 years)
  2. Annual churn prediction for customers near the end of warranty (car age >4 and <7 years)

We must also add a macroscopic view of the life time cycle and churn, giving decision makers the time they need to create a successful business and marketing strategy aimed at increasing service visits:

  3. Evolution analysis of the Customer Life Time Cycle associated with churn prediction

Figure 1. Evolution of maintenance Life Time Cycle in relation with car age

The Solution: Directing Intelligence in Business

Data analysis is a vital component of strategic planning for companies that are aware of worldwide competition, ever-shorter production cycles and increasing customer requirements. Given the current speed of communication through the Internet of Things, it is important to identify meaningful patterns quickly within the collected data.

DIRECTING's mission is to design a knowledge architecture plan as part of business engineering in each enterprise, and to create data mining applications that let users who do not necessarily have a statistical background assess and understand the identified patterns.

This has been accomplished through the initial concept of DATACTIF®, a data mining platform able to generate concept applications tailor-made for each enterprise's needs, while bringing 15 years of experience in learning processes, accumulating knowledge and finding solutions to problems in the industrial, financial and retail sectors.

DATACTIF® uses machine learning methods and algorithms such as neural networks, Kohonen SOM with U-Matrix visualization, fuzzy systems, genetic algorithms and Support Vector Machines. It also contains visualization methods that allow both a global view of the domain under analysis and an analytical view of all the details offered by the existing data.

Churn Prediction Methodology

Predictive modeling is used to forecast a particular event. It assumes that the analyst has a specific question to ask. The model provides the answer by assigning ranks, which determine the likelihood of certain classes. In our case, for the automotive company to predict which customers are likely to stop maintenance, it has to prepare for predictive modeling by feeding data about two types of customers into the data mining tool: customers who have stopped and customers who have continued. We applied predictive modeling to a sample of 500,000 customers with their historical maintenance data for the period from 2009 to the end of 2013.

We also used January 2014 data for one training set and, to verify our models, the real maintenance visits between 1/2/2014 and 31/7/2014.

In more detail, the training data consists of the state x, which describes certain instances of the problem, and the desired response.

For cars <4 years old, we randomly selected 3,000 customers from our database as the training set, of whom 652 had a maintenance service in January 2014 and the remaining 2,348 did not.
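The class balance of this training set follows directly from the counts above. The short sketch below only restates that arithmetic; the constant and function names are illustrative and do not come from the original study.

```python
# Counts taken from the text; all names here are illustrative assumptions.
TRAINING_SIZE = 3000
SERVICED_JAN_2014 = 652
NOT_SERVICED = TRAINING_SIZE - SERVICED_JAN_2014


def share(count: int, total: int) -> float:
    """Return count as a fraction of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return count / total


serviced_share = share(SERVICED_JAN_2014, TRAINING_SIZE)
not_serviced_share = share(NOT_SERVICED, TRAINING_SIZE)

print(f"serviced: {serviced_share:.1%}, not serviced: {not_serviced_share:.1%}")
```

Roughly one customer in five in this sample had a service in January 2014, so the two classes are unbalanced, which is worth keeping in mind when reading accuracy figures.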

For cars >4 and <7 years old, we defined churn customers as those who had not made any service visit in the past three years (2011, 2012, 2013) and not-churn customers as those who had a service every year over the past three years, combined with the January 2014 results. We randomly selected 5,000 customers as the training set. We used the following input variables:

CAR_AGE: normalized from 0 to 1
KLM: normalized from 0 to 1
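The labeling rule and the 0 to 1 scaling described above can be sketched as follows. This is a minimal illustration, not the study's implementation: it assumes that customers with a partial visit history (some but not all of the three years) fall outside both definitions, it assumes plain min-max scaling with clipping, and it leaves out how the January 2014 results were combined, since the text does not say. The function names and the bounds used in the usage lines are hypothetical.

```python
from typing import Iterable, Optional

LOOKBACK_YEARS = (2011, 2012, 2013)  # the three years named in the text


def churn_label(visit_years: Iterable[int]) -> Optional[int]:
    """Return 1 for churn, 0 for not-churn, None when neither rule applies."""
    visited = set(visit_years) & set(LOOKBACK_YEARS)
    if not visited:
        return 1  # no service visit in any of the three years
    if len(visited) == len(LOOKBACK_YEARS):
        return 0  # a service visit in every one of the three years
    return None  # partial history: assumed to be left out of the training set


def min_max(value: float, lo: float, hi: float) -> float:
    """Scale value into [0, 1] given bounds lo < hi, clipping outliers."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    scaled = (value - lo) / (hi - lo)
    return min(1.0, max(0.0, scaled))


# Usage with hypothetical bounds: a car aged 5 in the 4 to 7 year segment.
car_age_scaled = min_max(5.0, lo=4.0, hi=7.0)
label = churn_label([2009, 2010])  # visits only before the lookback window
```

Scaling both inputs to the same range keeps one variable from dominating the kernel computation simply because of its units.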

After the training phase, our model predicts the desired response for new cases (those not included in the training data). We used support vector machines (SVMs) with a polynomial kernel, which represents the similarity of vectors (training samples) in a feature space over polynomials of the original variables and so allows non-linear models to be learned.
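To make the idea of "a feature space over polynomials of the original variables" concrete, the sketch below computes a polynomial kernel in plain Python and checks it against an explicit degree-2 feature map. The degree, gamma and coef0 values are assumptions chosen for illustration (the article does not state the settings used), and the two sample vectors are hypothetical normalized (CAR_AGE, KLM) pairs.

```python
import math
from typing import List, Sequence


def poly_kernel(x: Sequence[float], y: Sequence[float],
                degree: int = 2, gamma: float = 1.0, coef0: float = 1.0) -> float:
    """Polynomial kernel K(x, y) = (gamma * <x, y> + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (gamma * dot + coef0) ** degree


def phi_degree2(x: Sequence[float]) -> List[float]:
    """Explicit feature map matching degree=2, gamma=1, coef0=1 for 2-D input."""
    x1, x2 = x
    r2 = math.sqrt(2.0)
    return [1.0, r2 * x1, r2 * x2, x1 * x1, r2 * x1 * x2, x2 * x2]


# Hypothetical normalized (CAR_AGE, KLM) vectors, not taken from the study.
a = [0.25, 0.40]
b = [0.75, 0.10]

k = poly_kernel(a, b)
explicit = sum(p * q for p, q in zip(phi_degree2(a), phi_degree2(b)))

# The kernel equals the inner product in the expanded polynomial feature
# space, without ever building that space explicitly.
assert math.isclose(k, explicit)
```

The kernel value is the same as the dot product of the expanded features, which is why an SVM with this kernel can fit a boundary that is non-linear in the original two variables.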

Churn Prediction Results

2014 Churn Prediction Results

  • for cars <4 years of age: Prediction Accuracy = 67.0%
  • for cars 5 years of age: Prediction Accuracy = 79.5%
  • for cars 6 years of age: Prediction Accuracy = 74.3%

Average accuracy: 76.8%

Results were verified against the real visits of 1/2/2014 - 31/7/2014.
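Prediction accuracy of this kind is the share of customers whose predicted class matches what was later observed in the verification window. The sketch below shows that calculation; the function name and the toy labels are hypothetical and are not data from the study.

```python
from typing import Sequence


def accuracy(predicted: Sequence[int], observed: Sequence[int]) -> float:
    """Fraction of cases where the predicted class equals the observed class."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not predicted:
        raise ValueError("at least one case is required")
    hits = sum(1 for p, o in zip(predicted, observed) if p == o)
    return hits / len(predicted)


# Hypothetical toy labels, NOT from the study: 1 = churn, 0 = not-churn.
predicted = [1, 0, 1, 1, 0, 0, 1, 0]
observed = [1, 0, 0, 1, 0, 1, 1, 0]

print(f"{accuracy(predicted, observed):.1%}")  # prints "75.0%"
```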

This is part 1; the case study continues in part 2.

Gregory Philippatos is the principal at DIRECTING Intelligence in Business, based in Athens, Greece, which creates Business Driven Analytics: applications of machine learning theory tailored to specific industries and, within each sector, to the specific needs and business requirements of every company.