Don Zereski, VP, Local Search & Discovery, HERE (Nokia) on Location Analytics and Architecture Evolution

We discuss trends in location analytics, evolution of HERE's analytics architecture, infrastructure challenges, data governance and more.

Don Zereski is the Vice President of Local Search & Discovery at HERE, a Nokia company. In this role he is responsible for the development and architecture of Local Search, HERE’s social platform, and the company’s places discovery apps such as HERE Explore. Previously, Don was also responsible for Nokia’s Big Data Analytics team, which is now part of Microsoft.

Before joining Nokia, Don was CEO of MetaCarta, which was acquired by Nokia. He originally joined MetaCarta as an independent director in August of 2005 and stepped into the CEO role in February of 2009. Previously, Don served as VP of Products for Lycos, where he was responsible for revenue and product strategy of the company’s U.S. properties, including its search sites, Matchmaker, and He led the properties through positive EBITDA and ultimately a sale to Korean portal Daum. Don is a graduate of Worcester Polytechnic Institute and received the school’s prestigious Washburn award in 2002.

Here is my interview with him:

Anmol Rajpurohit: Q1. According to you, what are the current and upcoming trends in location analytics?

Don Zereski: Here are three important themes – maturing into an end-to-end solution, higher-level abstractions and going from batch to real-time.
Once you actually start to collect and manipulate petabytes of data, you quickly learn that managing the data is the hard part, not the Big Data infrastructure itself.  This is especially true as the value of the data grows into a corporate asset. 
Expect to see more tools aimed at addressing data management and collection at scale in addition to data processing.  Also expect to see more and more tools that make it easy to process large volumes of data by pointing & clicking or typing at a command line.  This is akin to the evolution of programming languages from assembler to C to Java and Python.   Finally, expect to see more tools aimed at leveraging large volumes of data for real-time applications like recommendations or fraud detection.

All of the above apply to location analytics but for us real-time is very important.   We see a growing number of applications that will demand applying analytics to personalize experiences like traffic routing or place recommendation on demand.

AR: Q2. How has the HERE analytics architecture changed over the past few years? What have been the key drivers of this change?

DZ: HERE’s Analytics architecture has changed completely over the past few years. We have gone from an in-house Hadoop-based system to a hybrid setup that uses AWS. Scalability and flexibility were the key drivers of the change. In our case, we have many different teams sharing a common data asset. With Amazon, we house the common data asset in S3 buckets and allow teams to independently run analytics jobs in their own EMR clusters. This is much better since it allows teams to scale up their EMR clusters as needed and get more work done.
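Editor's illustration: the arrangement described above, one shared data asset with independently sized compute per team, can be sketched in plain Python. This is a toy model and not HERE's actual setup. `SHARED_STORE`, `TeamCluster`, the object keys and the sample values are all hypothetical, and a thread pool merely stands in for an EMR cluster reading from S3.

```python
from concurrent.futures import ThreadPoolExecutor
from types import MappingProxyType

# Hypothetical stand-in for the shared data asset (S3 in the interview):
# object keys mapped to records. MappingProxyType exposes a read-only view.
SHARED_STORE = MappingProxyType({
    "probes/part-0": [12, 40, 33],
    "probes/part-1": [25, 18],
    "probes/part-2": [50, 7, 21, 9],
})


class TeamCluster:
    """Toy model of one team's compute cluster, sized independently."""

    def __init__(self, team, workers):
        self.team = team
        self.workers = workers

    def run(self, map_fn, reduce_fn):
        # Apply map_fn to each partition using this team's own worker pool,
        # then combine the partial results. The store is only read.
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            partials = list(pool.map(map_fn, SHARED_STORE.values()))
        return reduce_fn(partials)


# Two teams with different capacity run different jobs over the same data.
traffic = TeamCluster("traffic", workers=4)
places = TeamCluster("places", workers=1)

total = traffic.run(sum, sum)    # sum of all records
largest = places.run(max, max)   # largest single record
```

Each team can change its own `workers` value without touching the data or the other team's jobs, which is the property the answer attributes to keeping storage and compute separate.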

AR: Q3. Why do you consider it important to separate storage and compute for Big Data?

DZ: In cases like ours, where you have a large pool of common data and many different teams operating on it, the ability to virtually separate storage from compute is extremely helpful for the reasons mentioned above. It allows teams to work in parallel on the same data without slowing each other down too much.

AR: Q4. Given the huge amount of data and user requests for HERE services, what have been the biggest infrastructure challenges for you?

DZ: The biggest challenge was scalability and we overcame that by taking advantage of AWS.   Amazon has been able to scale AWS to meet our demands.

AR: Q5. What are your thoughts on how data governance is being perceived currently? How do you see that change in future?

DZ: I think it’s overlooked in many cases or viewed as yet another nagging set of processes. Once companies run up against some of the issues that data governance is meant to avoid (privacy, security, life cycle, consistency, etc.), they will see value in it. Governance gets more important as the size and importance of the data asset grows.

AR: Q6. Data Scientist has been termed the sexiest job of the 21st century. Do you agree? What advice would you give to people aspiring to a long career in Data Science?

DZ: It’s a job that certainly has gotten a lot of attention and hype. Data Science is all about answering the right questions. To do this well, you have to understand how to work with data and know what questions to ask. The best Data Scientists that I’ve worked with have a unique mix of skills in math & statistics, programming and domain knowledge. The first two are sort of the basic tools of the trade. The last one, deep knowledge of a domain (e.g. Business or Biotech), is what sets some people apart. It allows them to understand what questions are best to ask.

AR: Q7. On a personal note, we are curious to know what keeps you busy when you are away from work?

DZ: I’m a fan of water in both liquid and frozen form.   In the summer, you can find me sailing along the coast of Southern New England.   In the winter, you can find me skiing in the mountains of Vermont and New Hampshire.