Enhancing Machine Learning Personalization through Variety

Personalization drives growth and is a touchstone of good customer experience. Personalization driven by machine learning can help companies improve this experience while improving the ROI of marketing campaigns. However, these techniques face challenges in deciding when personalization makes sense, and how and when specific options should be recommended.

By Raghavan Kirthivasan, Director of Data Science at Epsilon India, and Anuj Chaturvedi, Lead Data Scientist at Epsilon India.


Businesses generally run campaigns of 8-10 weeks duration with weekly e-mails sent to the reachable customer base. Since the customer’s purchase pattern depends on the nature of products in the product catalog, the time to the next purchase is usually a month or more, depending on the category. As a result, for most of the customers, the content being sent across the weekly campaigns is usually the same because the model recommendations do not change weekly based on the historical data. Therefore, stagnant recommendations over a period of 3 to 4 weeks may lead to a bad customer experience.

On the flip side, depending on the frequency of purchase, sending e-mails with similar content may also serve as a reminder in case the customer missed any of the previous e-mails. Hence, the strategy of sending repeat e-mails with the same recommendation may also lead to increased revenue.

This article looks at the personalization of content sent through e-mails for marketing campaigns, such as offers at the SKU, product, or category level, and outlines how the concept of variety can be applied to it.

Desired State and Limitation

The ideal situation would be a scenario where we can show the customer more options at a weekly level rather than at the monthly level, with recommendations from the ML model to improve the overall customer experience without sacrificing the ROI. We will also talk about the different stages in the campaign that are ideal for introducing such a strategy.

The idea is to devise a strategy that introduces variety in the recommendations where possible such that the model recommendations do not change and the campaign lift is not sacrificed.

As mentioned before, since the recommendations from the ML model usually do not change frequently, the idea is to identify the categories outside the ML model's top 3 or 4 recommendations that the customer is most likely to purchase, without sacrificing the ROI.

Proposed Solution

Let us assume that there are 50 categories that have been earmarked for a campaign. These are the categories from which the client would like to send e-mails to their customers, informing them about the discounts available on the categories most relevant to them. We call this group of categories the “Choice Pool.”

The expectation from the Data Science Team is to identify the top 3 or 4 categories which the customer is most likely to purchase within the “Choice Pool.” The model stack ranks all the categories within the choice pool, and we select the top 3 or 4 of them to send in the e-mail.
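The stack-ranking step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the category names, the scores, and the function name top_k_categories are all hypothetical, and the model is assumed to output one purchase probability per category for each customer.

```python
from typing import Dict, List


def top_k_categories(scores: Dict[str, float], k: int = 4) -> List[str]:
    """Stack rank categories by model score (highest first) and keep the top k."""
    # Ties are broken alphabetically so the result is deterministic.
    ranked = sorted(scores, key=lambda c: (-scores[c], c))
    return ranked[:k]


# Hypothetical purchase probabilities for one customer over a small choice pool.
customer_scores = {
    "shoes": 0.31, "bags": 0.22, "watches": 0.18,
    "belts": 0.12, "hats": 0.09, "scarves": 0.08,
}
recommended = top_k_categories(customer_scores, k=3)
print(recommended)
```

The categories left out of the top k remain in the ranked list, and those are the candidates any variety strategy would draw from.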

Variety is not optimized for what the customer is going to buy but is suggestive of the additional products that the customer may buy.

Introducing variety

Various actions can be taken with respect to variety introduction depending on where we are in the problem-solving stage. It makes more sense to break the problem further into the following questions:

  • Where to introduce variety?
  • When to introduce variety?
  • How to introduce variety?

We will try to separately discuss and find out the potential answers to these questions.


Where to introduce variety?


Answering this question helps us identify the right set of customers for whom it might make sense to provide variety in recommendations. Depending on which stage of the model life cycle we are in, the strategy to answer it might change.


Before the model goes into production, two dimensions can help us identify a good strategy for mixing variety into recommendations.

  1. Model strength for the choice recommendations: This can be understood by analyzing how far apart the model probabilities for the different choices in the pool are. The greater the distance, the more confident the model is in identifying the relevant set of choices for the customer.
  2. Historical Choice Pool of the customer: The set of choices that the customer has purchased, or the purchase history available for the customer.
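The two dimensions above could be quantified per customer roughly as shown here. Both measures are assumptions made for illustration: the article does not define them precisely. The sketch takes "distance" to be the gap between the last recommended score and the first non-recommended score, and "history" to be the share of the choice pool the customer has bought from. All names and numbers are hypothetical.

```python
from typing import Dict, Iterable


def separation_gap(scores: Dict[str, float], k: int = 4) -> float:
    """Gap between the k-th and (k+1)-th ranked scores; larger means a clearer top k."""
    ranked = sorted(scores.values(), reverse=True)
    if len(ranked) <= k:
        raise ValueError("the pool must contain more than k choices")
    return ranked[k - 1] - ranked[k]


def history_breadth(purchased: Iterable[str], pool: Iterable[str]) -> float:
    """Share of the choice pool that the customer has purchased from."""
    pool_set = set(pool)
    if not pool_set:
        raise ValueError("the pool must not be empty")
    return len(set(purchased) & pool_set) / len(pool_set)


# Hypothetical customers: one with a clear top 3, one with nearly flat scores.
confident = {"a": 0.40, "b": 0.30, "c": 0.20, "d": 0.04, "e": 0.03, "f": 0.03}
flat = {"a": 0.18, "b": 0.17, "c": 0.17, "d": 0.16, "e": 0.16, "f": 0.16}

print(separation_gap(confident, k=3), separation_gap(flat, k=3))
print(history_breadth(["a", "b", "z"], confident))
```

A nearly flat score profile suggests the model has little basis for preferring its top choices over the rest of the pool, which is where swapping in other choices costs the least.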

The following table combines both aspects and identifies the set of actions for the different combinations:

The above strategy could be explored before the model goes into production. But following this, we have to look into how we make decisions after the model has gone live.


Once the model goes live in production, we have a crucial extra piece of information, i.e., how the customers' actual choices fare vis-à-vis the model recommendations. This new information can prove very valuable in further refining the decision on which customers are more likely to be impacted positively by variety in recommendations.

There might be different ways to use this information and make decisions. Below is one way to use the customers' actual purchase behavior to identify the set of customers who might benefit from variety in recommendations.
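As a purely illustrative sketch, and an assumption rather than the authors' method, one simple possibility is to compute a per-customer hit rate: the share of the categories a customer actually bought that had been recommended to them. Customers who buy from the choice pool but mostly outside their recommendations are flagged as candidates for variety. The function names, the 0.5 threshold, and the data are hypothetical.

```python
from typing import Dict, List, Optional


def hit_rate(recommended: List[str], purchased: List[str]) -> Optional[float]:
    """Share of purchased categories that were recommended; None if nothing was bought."""
    bought = set(purchased)
    if not bought:
        return None  # no purchases yet, so there is nothing to read
    return len(bought & set(recommended)) / len(bought)


def variety_candidates(recs: Dict[str, List[str]],
                       purchases: Dict[str, List[str]],
                       max_hit_rate: float = 0.5) -> List[str]:
    """Customers whose purchases fall mostly outside their recommendations."""
    flagged = []
    for customer, recommended in recs.items():
        rate = hit_rate(recommended, purchases.get(customer, []))
        if rate is not None and rate < max_hit_rate:
            flagged.append(customer)
    return sorted(flagged)


# Hypothetical recommendations and observed purchases after a few weeks live.
recs = {"c1": ["shoes", "bags", "socks"],
        "c2": ["shoes", "bags", "socks"],
        "c3": ["hats", "belts", "bags"]}
purchases = {"c1": ["shoes", "bags"], "c2": ["hats", "belts"], "c3": []}

print(variety_candidates(recs, purchases))
```

Customers with no purchases are deliberately left out of the flagged set, since the absence of purchases says nothing about whether the recommendations were relevant.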


When to introduce variety?


Since blending variety into recommendations might not work out in all scenarios, we try to search for campaign criteria where introducing variety seems logical.

  1. Length of campaign: It seems apt to introduce variety in long-running campaigns, where customers are exposed to the campaign multiple times and are therefore likely to see the same recommendations across multiple e-mails.
  2. Campaign runtime before variety introduction: Some time needs to be given for the campaign to mature before introducing variety in recommendations as the initial results might not provide an accurate read. About 2 to 3 weeks seems to be a good time that would allow the campaign to mature before introducing variety and would enable us to make more informed decisions.
  3. Choice Pool size: The number of choices in the pool plays an important role in variety-related decisions. The decision around variety becomes insignificant with a very small choice pool, as it leaves very few options to try out. A choice pool with 20 options seems like a good place to start. We can also think about it in terms of the percentage of options left after one run, e.g., if we recommend 4 choices from a pool of 20, 80% of the choices in the pool are still unused after the first recommendation cycle for the customer.
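The pool-size arithmetic in the last point can be checked with a short sketch. It assumes that no choice is repeated across sends; the function name remaining_share is hypothetical.

```python
def remaining_share(pool_size: int, per_send: int, sends: int) -> float:
    """Fraction of the choice pool not yet shown after a number of sends."""
    if pool_size <= 0:
        raise ValueError("pool_size must be positive")
    # Assumes every send shows per_send choices the customer has not seen before.
    shown = min(pool_size, per_send * sends)
    return (pool_size - shown) / pool_size


# 4 choices out of a pool of 20 leaves 80% of the pool after the first cycle.
print(remaining_share(pool_size=20, per_send=4, sends=1))
# A pool of 5 leaves only one unseen option after a single send.
print(remaining_share(pool_size=5, per_send=4, sends=1))
```

Under these assumptions, a pool of 20 with 4 choices per send is fully used up after five sends, which gives a rough upper bound on how long a strictly non-repeating rotation can last.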


How to introduce variety?


There could be many ways to blend variety into recommendations in cases where the model recommendations do not change over time due to certain reasons. We are going to explore a few of the technical and heuristics-based ways here.

Market Basket Analysis

This is one of the technical ways to bring in variety and is based on Association Rule Mining. Association Rule Mining is a technique that measures the strength of association between products purchased together and identifies patterns of co-occurrence. A co-occurrence is when two or more things take place together. Association rules do not extract an individual's preferences; rather, they find relationships between sets of items within each distinct transaction.

In layman's terms, Market Basket Analysis (MBA) identifies the sets of products that are generally bought together.

One way is to use the model to find the choice with the highest purchase probability, then use MBA to identify the set of products that have the highest association with that model-recommended choice, and use these as the recommended choices.
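A minimal sketch of this idea computes, for pairs involving the model's top choice (the "anchor"), the confidence and lift of the rule anchor -> other item over a set of transactions. Ranking by lift is an assumption of this sketch; the article only asks for the "highest association". The baskets and the function name associated_choices are hypothetical, and a production system would more likely use a dedicated association rule mining library.

```python
from typing import List, Set, Tuple


def associated_choices(transactions: List[Set[str]], anchor: str,
                       top_n: int = 3) -> List[Tuple[str, float, float]]:
    """Rank items bought together with `anchor` as (item, confidence, lift)."""
    n = len(transactions)
    anchor_count = sum(1 for t in transactions if anchor in t)
    if n == 0 or anchor_count == 0:
        return []
    item_count, pair_count = {}, {}
    for t in transactions:
        for item in t:
            item_count[item] = item_count.get(item, 0) + 1
            if anchor in t and item != anchor:
                pair_count[item] = pair_count.get(item, 0) + 1
    rules = []
    for item, both in pair_count.items():
        confidence = both / anchor_count            # P(item | anchor)
        lift = confidence / (item_count[item] / n)  # P(item | anchor) / P(item)
        rules.append((item, round(confidence, 3), round(lift, 3)))
    # Highest lift first, then highest confidence, then name for determinism.
    rules.sort(key=lambda r: (-r[2], -r[1], r[0]))
    return rules[:top_n]


# Hypothetical transaction history; "shoes" is the model's top choice.
baskets = [
    {"shoes", "socks"}, {"shoes", "socks", "belts"}, {"shoes", "belts"},
    {"bags", "socks"}, {"bags", "hats"}, {"shoes", "socks", "hats"},
]
print(associated_choices(baskets, anchor="shoes"))
```

In this toy data, socks co-occur with shoes most often, but belts rank first by lift because they are rarely bought without shoes. Whether to rank by confidence or by lift is a design choice that depends on how popular items should be treated.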

We will now look at some heuristic-based ways to add variety to our recommendations.

Price-based Variety

One rule-based approach to include variety could be to consider price as the foundation for categorizing the choices into groups. This kind of categorization enables us to provide recommendations to the customers from across the price band range.

We could then select the choices with the highest model probability of being bought from each of these price-based groups and provide variety in the form of options that have not yet been presented to the respective customer. This approach also helps us understand whether variety in the form of differently priced options resonates with the customer base.
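The price-based selection could look roughly like the sketch below. The three bands, their cut-offs (25 and 75), and all names and values are hypothetical; in practice the bands would come from the business or from the price distribution of the catalog.

```python
from typing import Dict, Set


def price_band(price: float, low_max: float = 25.0, mid_max: float = 75.0) -> str:
    """Assign a price to one of three illustrative bands."""
    if price <= low_max:
        return "low"
    if price <= mid_max:
        return "mid"
    return "high"


def variety_by_band(scores: Dict[str, float], prices: Dict[str, float],
                    already_shown: Set[str]) -> Dict[str, str]:
    """For each price band, pick the highest-scoring choice not yet shown."""
    best: Dict[str, str] = {}
    for choice, score in scores.items():
        if choice in already_shown:
            continue
        band = price_band(prices[choice])
        if band not in best or score > scores[best[band]]:
            best[band] = choice
    return best


# Hypothetical model scores, prices, and previously sent recommendations.
scores = {"socks": 0.30, "hats": 0.10, "belts": 0.20,
          "bags": 0.15, "watches": 0.12, "coats": 0.05}
prices = {"socks": 10, "hats": 20, "belts": 40,
          "bags": 70, "watches": 150, "coats": 200}
already_shown = {"socks", "belts", "watches"}

print(variety_by_band(scores, prices, already_shown))
```

A band with no unseen choices is simply absent from the result, so the caller can decide whether to repeat an earlier choice or leave that slot to another strategy.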

Product Category-based Variety

Another rule-based approach that can be followed is to use choice categories defined by the business as the basis for selecting choices and provide variety across different categories. Similar to the last strategy, we could present the highest model likelihood options from each of the choice categories that have not been presented to the respective customers yet.
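To show how such a rule plays out over a campaign, the sketch below simulates several weekly sends: each week it takes the highest-scoring unseen choice from every business-defined category and then marks those choices as shown. The category mapping, the scores, and the function name plan_sends are hypothetical.

```python
from typing import Dict, List


def plan_sends(scores: Dict[str, float], group_of: Dict[str, str],
               sends: int) -> List[Dict[str, str]]:
    """Per send, pick the top unseen choice from each business category."""
    shown = set()
    plan = []
    # Rank once: highest score first, name as a deterministic tie-break.
    ranked = sorted(scores, key=lambda c: (-scores[c], c))
    for _ in range(sends):
        picks: Dict[str, str] = {}
        for choice in ranked:
            group = group_of[choice]
            if choice not in shown and group not in picks:
                picks[group] = choice
        shown.update(picks.values())
        plan.append(picks)
    return plan


# Hypothetical scores and business-defined categories.
scores = {"shoes": 0.30, "bags": 0.25, "socks": 0.20, "hats": 0.15, "belts": 0.10}
group_of = {"shoes": "footwear", "socks": "footwear",
            "bags": "accessories", "hats": "accessories", "belts": "accessories"}

for week, picks in enumerate(plan_sends(scores, group_of, sends=3), start=1):
    print(week, picks)
```

By the third send the footwear category is exhausted in this example, which illustrates why the size of the choice pool limits how long variety can be sustained without repeats.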

Conclusion and Future Scope

Through this article, we have tried to explore a very practical problem that is seen with the recommendation models being used today.

Variety can be introduced in cases where the personalized recommendations, for various reasons, do not change with every send. We have discussed ways of bringing in variety without sacrificing the campaign performance.

The strategy discussed above is a hybrid of analytical and heuristics-based approaches and should act as a starting point for any similar problem businesses face in personalizing recommendations. The next steps could vary from project to project, depending on the preliminary results seen after applying this approach.



Bio: Raghavan Kirthivasan is a Director of Data Science at Epsilon India. He has 18 years of experience in Data Science/Analytics with functional expertise in Marketing/Risk management and Fraud Analytics across geographies (US, UK, APAC). In his previous roles at WNS, AIG, and Epsilon Agency, Raghavan incubated the Data Science teams.

Anuj Chaturvedi is a Lead Data Scientist at Epsilon India. He has close to 9 years of experience in data science, leveraging various data assets to support different business areas including Digital Experience, Customer, Marketing & Pricing. He currently develops ML solutions at Epsilon that aim to improve customer experience in the digital space.