Interview: Lei Shi, on Analytics behind the Perfect Match

We discuss analytics at ChinaHR, matching job seekers and employers, traditional job fairs vs online recruitment, key metrics and analytical insights.


Lei Shi is currently the CTO of ChinaHR, one of China's top online recruitment websites. Lei joined Microsoft Research Asia in 2005 as a researcher in the area of Natural Language Processing. Later he joined Yahoo!, in charge of Yahoo!'s search product development in Beijing. He has extensive experience in leveraging big data and machine learning techniques in solving many complex problems.

Here is my interview with him:

Anmol Rajpurohit: Q1. What does ChinaHR do? How important is Analytics to ChinaHR?

Lei Shi: ChinaHR is a leading online recruitment website in China. It helps tens of millions of Chinese people find jobs every year. Job seekers can submit CVs and apply for jobs there for free, while we charge employers for posting jobs and purchasing CVs. As most of our service is based on user-generated content (UGC, such as jobs and CVs), analytics on this UGC data is essential and can be applied to many areas of our service. A key area is providing the "perfect match" to our employers and job seekers, as this is their primary goal in using our service.

Analyzing the profiles of the jobs and CVs behind successful hires helps us increase the likelihood of employment, which can significantly improve the effectiveness and efficiency of our service. Analytics can also help employers and job seekers on our website refine their job postings and CVs to maximize their chances of employment. In addition, data mining can facilitate lead generation, helping us better identify potential customers in order to increase revenue.

AR: Q2. How do you define the "Perfect Match" between job seekers and employers? How do you measure the success of matching?

LS: The "perfect match" here between job seekers and employers is quite different from matching in other applications, such as web search. In a web search engine, when a user types in a query, the system tries to retrieve documents that are as relevant as possible. But in online recruitment, we cannot simply recommend the highest-caliber candidates to employers, because they may be overqualified for the job or decline the employer's offer. The same holds in reverse from the job seeker's perspective. So the "perfect match" here should rather be perceived as a mutual match: both sides should meet each other's criteria and interests, without either being overqualified.

A mathematical model needs to be built to estimate the degree of match for a given pair of job and CV. The success of matching should be measured by concrete actions such as an increase in job applications, CV purchases, and actual employment through our online service.
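To make the idea of a mutual match score concrete, here is a minimal sketch in Python. It is not ChinaHR's actual model: the `Job` and `CV` fields, the overqualification penalty, and the use of a harmonic mean to combine the two sides are all assumptions chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    # Hypothetical job profile; all fields are illustrative assumptions.
    required_skills: set
    min_years: float
    max_years: float  # experience beyond this is treated as overqualified


@dataclass
class CV:
    # Hypothetical candidate profile; all fields are illustrative assumptions.
    skills: set
    years: float
    wanted_skills: set = field(default_factory=set)  # what the seeker wants to work on


def employer_fit(job: Job, cv: CV) -> float:
    """How well the candidate meets the job's criteria, in [0, 1]."""
    if job.required_skills:
        coverage = len(job.required_skills & cv.skills) / len(job.required_skills)
    else:
        coverage = 1.0
    if cv.years < job.min_years:
        experience = cv.years / job.min_years      # underqualified
    elif cv.years > job.max_years:
        experience = job.max_years / cv.years      # overqualified: penalized too
    else:
        experience = 1.0
    return coverage * experience


def seeker_fit(job: Job, cv: CV) -> float:
    """How well the job meets the candidate's interests, in [0, 1]."""
    if not cv.wanted_skills:
        return 1.0
    return len(cv.wanted_skills & job.required_skills) / len(cv.wanted_skills)


def mutual_match(job: Job, cv: CV) -> float:
    """Harmonic mean of both sides: low whenever either side is a poor fit."""
    e, s = employer_fit(job, cv), seeker_fit(job, cv)
    return 0.0 if e + s == 0 else 2 * e * s / (e + s)


job = Job(required_skills={"python", "sql"}, min_years=2, max_years=5)
well_matched = CV(skills={"python", "sql", "excel"}, years=3, wanted_skills={"python"})
overqualified = CV(skills={"python", "sql", "excel"}, years=15, wanted_skills={"python"})
print(mutual_match(job, well_matched), mutual_match(job, overqualified))
```

The harmonic mean is one simple way to encode "mutual": a candidate who is a strong fit for the employer but has no interest in the job, or the reverse, still scores low. In practice the two fit functions would be learned from data rather than hand-written.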

AR: Q3. How do you compare traditional job fairs and online recruitment in terms of efficiency and effectiveness?

LS: Traditional job fairs require job seekers and employers to physically go to a location to share their CVs and job information. These temporal and spatial constraints significantly limit their efficiency and effectiveness. Online recruitment services, by contrast, are available 24x7 and are not confined by physical boundaries.

The convenience of access to information through the Internet gave rise to the first revolution in recruitment, bringing it from offline to online. On an online recruitment website, both employers and job seekers can conveniently access a much larger collection of jobs and CVs than at a traditional job fair, and this has led to the great success of many online recruitment websites. However, with so much information available online, users are faced with information overload. For instance, how an employer can find the most suitable candidate out of tens of millions of CVs is a real challenge. So beyond submitting jobs and CVs, users' primary goal on online recruitment websites is to find the right jobs or candidates.

I believe that an intelligent algorithm that leverages big data and analytics to find the "perfect match" will bring another revolution to the recruitment business.

AR: Q4. What kind of Analytical insights are helpful in hiring decisions?

LS: Hiring decisions are normally based on many factors such as education level, relevant skill sets, age, etc. However, given the huge volume of data, it is impractical for a human to screen every candidate and pick the best match. With data analytics and machine learning techniques, these factors can be captured mathematically as features, and the machine learning algorithm can automatically learn how these factors affect the hiring decision, using data on successful matches as a reference. In addition, with unsupervised learning techniques such as topic models, for example LDA (Latent Dirichlet Allocation), we can even discover new factors that are hidden in the big data.
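As a rough illustration of learning how such factors affect the outcome, the sketch below fits a logistic regression by gradient descent on a tiny made-up dataset. The interview does not say which algorithm ChinaHR uses; the choice of logistic regression, the three feature names, and all numbers are assumptions for illustration only.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x) -> float:
    """Estimated probability that a (job, CV) pair ends in a successful match."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train(X, y, lr=0.5, epochs=2000):
    """Batch gradient descent on the logistic loss; returns (weights, bias)."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict(w, b, xi) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Toy, invented data. Hypothetical features per (job, CV) pair:
# [skill_overlap, education_level_matches, experience_gap (scaled to 0..1)]
X = [
    [0.9, 1, 0.0],
    [0.8, 1, 0.1],
    [0.7, 0, 0.2],
    [0.2, 0, 0.8],
    [0.1, 1, 0.9],
    [0.3, 0, 0.6],
]
y = [1, 1, 1, 0, 0, 0]  # 1 = pair led to a successful match (made-up labels)

w, b = train(X, y)
print("learned weights:", w)
print("strong pair:", predict(w, b, [0.9, 1, 0.0]))
print("weak pair:", predict(w, b, [0.1, 0, 0.9]))
```

The sign and size of each learned weight indicate how that factor pushes the predicted outcome, which is the sense in which the model "learns how these factors affect the hiring decision". A production system would use far more features and data, and a well-tested library rather than a hand-rolled trainer.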

AR: Q5. What metrics play a key role in your optimization criteria for employer and job-seeker matching?

LS: The major user activities on online recruitment websites are job applications by job seekers and CV purchases by employers, and they are key factors in a good user experience. Since, as mentioned, a "match" is defined as mutual and its success is measured by actual interviews or hires, we should not simply optimize the number of job applications or the number of CV purchases individually. Optimizing for mutual matches instead improves not only the user experience for job seekers and employers, but also website revenue.

Second part of the interview