KDnuggets Home » News » 2016 » Sep » Meetings » Doing the Data Science That Drives Predictive Personalization ( 16:n33 )

Doing the Data Science That Drives Predictive Personalization


Agile collaboration within data science teams is essential to the vision of customer analytics and personalization. Attend the IBM DataFirst Launch Event on Sep 27 in New York City to engage with open-source community leaders and practitioners.



Customer analytics is a prime proving ground for many business data scientists. One of their primary tasks is the building, testing, and revising of customer segmentation models.

Segmentation analysis can be a process of seemingly endless iteration. Typically, it involves testing and tweaking feature-engineering models in order to find the specific independent variables that are most predictive of some future scenario of interest. Data scientists identify segmentations through statistical exploration of relationships among variables such as gender, age, ethnicity, nationality, education level, prior purchases, and personality types.
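One pass of that iteration can be sketched in a few lines. The example below is only an illustration, not a method the article prescribes: it ranks candidate variables by the absolute Pearson correlation each has with an outcome of interest. The field names (`age`, `prior_purchases`, `purchased`) and the toy records are hypothetical, and a real project would likely use richer predictive-power measures than a linear correlation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def rank_features(rows, target):
    """Return (feature, |correlation with target|) pairs, strongest first."""
    features = [name for name in rows[0] if name != target]
    ys = [row[target] for row in rows]
    scores = {f: abs(pearson([row[f] for row in rows], ys)) for f in features}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy records; 'purchased' is the outcome we want to predict.
customers = [
    {"age": 23, "prior_purchases": 1, "purchased": 0},
    {"age": 35, "prior_purchases": 4, "purchased": 1},
    {"age": 41, "prior_purchases": 0, "purchased": 0},
    {"age": 29, "prior_purchases": 6, "purchased": 1},
    {"age": 52, "prior_purchases": 5, "purchased": 1},
    {"age": 47, "prior_purchases": 1, "purchased": 0},
]

ranking = rank_features(customers, "purchased")
for feature, score in ranking:
    print(f"{feature}: {score:.2f}")
```

In this toy data, prior purchases track the outcome far more closely than age does, so the ranking would steer the next round of tweaking toward purchase history.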


Microsegmentation is the art of finding narrower customer segments: smaller groups who share fine-grained behavioral affinities. Being able to drill into the entire aggregated population of customer data, including rich real-time behavioral data, enables you to do more fine-grained target marketing, nuanced customer experience optimization, and context-sensitive next best action. Also, if you have ample detail on all the inventory you carry and everything that customers have requested, no matter how seemingly unpopular, you can do powerful long-tail analysis on overlooked product niches of keen interest to specific customer segments.

Pushed to the logical extreme in today’s business environment, microsegmentation can culminate in the vaunted “category of one,” also known as “predictive personalization.” Though predictive personalization may prove game-changing for marketers, it can be a huge burden for the data scientists whose statistical models set it in motion. From a data scientist’s point of view, the requirements that drive extreme microsegmentation bring the following challenges:

  • Burdensome model-governance workloads: For each customer to be targeted 1:1, data scientists need methods and tools for tracking and managing several types of models. In addition to the segmentation model that uniquely identifies customer-specific predictors, there may be linked customer-specific churn, upsell, cross-sell, and sentiment models. Each of those models would have a distinct lifecycle that involves successive versions needing to be tracked and controlled.
  • High-throughput model-automation workloads: If you’re going to be tweaking your in-production segmentation models constantly, as well as all other models associated with those ever-changing segments, you’ll need high-powered automation tools to generate, validate, and promote each version in turn. For data science professionals, automating their data modeling and engineering pipelines is an absolute imperative. It can free them from industrial-grade drudgery and help them better focus on the sorts of exploration, modeling, and visualization challenges that require expert human judgment.
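To make the governance side of these two challenges concrete, here is a minimal sketch of a per-customer model registry. Everything in it is an assumption made for illustration: the `ModelRegistry` and `ModelVersion` names, the status labels, and the `min_score` validation gate are hypothetical and do not correspond to any particular product's API.

```python
from dataclasses import dataclass, field


@dataclass
class ModelVersion:
    version: int
    validation_score: float
    status: str = "candidate"  # candidate -> production -> retired


@dataclass
class ModelRegistry:
    # history[(customer_id, model_kind)] -> list of ModelVersion, oldest first
    history: dict = field(default_factory=dict)

    def register(self, customer_id, model_kind, validation_score):
        """Record a new version of one model for one customer."""
        versions = self.history.setdefault((customer_id, model_kind), [])
        new_version = ModelVersion(len(versions) + 1, validation_score)
        versions.append(new_version)
        return new_version

    def promote(self, customer_id, model_kind, version, min_score=0.7):
        """Promote a version if it clears the (hypothetical) validation gate."""
        versions = self.history[(customer_id, model_kind)]
        candidate = versions[version - 1]
        if candidate.validation_score < min_score:
            return False
        for existing in versions:
            if existing.status == "production":
                existing.status = "retired"  # only one live version at a time
        candidate.status = "production"
        return True

    def in_production(self, customer_id, model_kind):
        """Return the live version for this customer and model kind, if any."""
        for existing in self.history.get((customer_id, model_kind), []):
            if existing.status == "production":
                return existing
        return None


registry = ModelRegistry()
registry.register("cust-001", "churn", validation_score=0.65)
registry.register("cust-001", "churn", validation_score=0.81)
first_promoted = registry.promote("cust-001", "churn", version=1)
second_promoted = registry.promote("cust-001", "churn", version=2)
live = registry.in_production("cust-001", "churn")
```

Keying the history by customer and model kind mirrors the point above that segmentation, churn, upsell, cross-sell, and sentiment models each carry their own version lifecycle.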

From my perspective, all of this should drive a “next best model” approach for business data science. Under this paradigm, the best-fit microsegmentation model, along with its associated churn, upsell, and ancillary models, is always running inline with your marketing, sales, and other customer-facing systems. Your model governance tools ensure the “best” part of this paradigm, while your automation tools ensure that the “next” in-line best model is deployed ASAP.
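One way to read the “next best model” rule is as a simple comparison between the model currently in line and its challengers. The sketch below assumes each model is summarized by a single validation score and that a challenger must beat the current model by a small margin before it is swapped in. The function name, the tuple layout, and the `min_gain` threshold are all hypothetical choices for illustration.

```python
def next_best_model(current, challengers, min_gain=0.01):
    """Pick the model that should be in line next.

    Each model is a (name, validation_score) tuple. A challenger replaces
    the current model only if it improves the score by at least min_gain,
    which avoids churning deployments over negligible differences.
    """
    best = max(challengers, key=lambda model: model[1], default=current)
    if best[1] - current[1] >= min_gain:
        return best
    return current


current_model = ("segmentation-v7", 0.80)

# A clearly better challenger takes over.
promoted = next_best_model(
    current_model,
    [("segmentation-v8", 0.84), ("segmentation-v9", 0.79)],
)

# A marginal improvement is not enough to displace the current model.
retained = next_best_model(current_model, [("segmentation-v10", 0.805)])

# With no challengers, the current model simply stays in line.
unchanged = next_best_model(current_model, [])
```

The margin is a design choice: set it to zero to always deploy the top scorer, or raise it when each redeployment is expensive.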

The next-best-model paradigm falls apart if you aren’t able to get the data scientists and subject matter experts on your development team to pool their collective expertise into the segmentation models that drive it all. That’s where the collaboration features in your model-governance environment become key. You should create a collaborative data science culture and offer incentives that encourage quantitative and domain experts to share and reuse each other’s best ideas.

Agile collaboration within data science teams is essential to this vision. You can never know in advance who has the best insight into the microsegmentation approach that might fit the customer data best. One smart data scientist may prove to be the “next best expert” who makes your extreme-personalization initiative a roaring success.

If you’re a working data scientist, data engineer, or data application developer, register here to attend the IBM DataFirst Launch Event on Tuesday, September 27 in New York. Engage with open-source community leaders and practitioners and learn how to put your data to work driving extreme personalization throughout your burgeoning cognitive business.