Who dominates analytics job market?
Peter Bruce reexamines who dominates the analytics job market and comes up with a different answer. His analysis shows 1.92 SAS jobs for every R job, but SQL and Java are the skills most in demand.
By Peter Bruce, Statistics.com, June 6, 2013
Bob Muenchen's very useful work on this topic, SAS Dominates Analytics Job Market; R up 42%, sent me back to some 2012 work we did at www.Statistics.com on the subject of what employers are looking for in the way of analytics skills. First, our main results:
1. Our numbers showed a much less SAS-dominant world: 1.92 SAS jobs for every R job. Bob had found 11.06 SAS jobs for every R job.
2. Like Bob, we found that R was gaining relative to SAS - in our May scrape the SAS/R ratio was 2.44, while in December 2012 it had declined to 1.92.
About our methodology: Rather than using Indeed.com, a consolidator, we ran web scrapes customized to Dice, Amazon, CareerBuilder, and Monster. Indeed.com includes most of those sources, but web scrapes are difficult to configure for it because its job postings can take different formats, depending on the original source.
The Amazon scrape is for jobs at Amazon itself (Amazon does not run a job market); it was included not to be comprehensive but to have a representative entry from a large tech-oriented firm that does its own hiring. We searched for jobs posted in the previous 7 days and used the search terms: analytics or forecasting or statistician or "data mining" or PMML. PMML was an experimental inclusion and turned out not to be important.
The web scraper then returned all job listings it could find with those criteria. The search was not perfect, simply because some of the job listing websites were not in a format that fit the scraper. We then searched through each job listing for a lengthy set of keywords. SAS, R, SPSS, etc. were included. Through trial and error several processing rules were developed to identify "R" correctly.
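To illustrate why "R" needs special handling, here is a minimal Python sketch of keyword matching over a job posting. It is not the code or the rule set we used: the short keyword list, the function names, and the specific rule for "R" (a standalone capital letter that does not touch letters, digits, or "&") are all assumptions made for the example.

```python
import re

# Hypothetical, shortened keyword list; the actual study used a lengthier set.
KEYWORDS = ["SAS", "SPSS", "SQL", "Java", "Python", "C++"]

# Assumed rule for "R": a standalone capital R that is not adjacent to
# letters, digits, or "&" (so "R&D" is not counted as the language).
R_PATTERN = re.compile(r"(?<![A-Za-z0-9&])R(?![A-Za-z0-9&])")


def keyword_pattern(keyword):
    """Build a case-insensitive pattern that matches the keyword as a whole token.

    Lookarounds are used instead of \\b so that "C++" can match and
    "Java" does not match inside "JavaScript".
    """
    return re.compile(
        r"(?<![A-Za-z0-9+#])" + re.escape(keyword) + r"(?![A-Za-z0-9+#])",
        re.IGNORECASE,
    )


PATTERNS = {kw: keyword_pattern(kw) for kw in KEYWORDS}


def skills_in_posting(text):
    """Return the set of keywords mentioned in one job posting."""
    found = {kw for kw, pattern in PATTERNS.items() if pattern.search(text)}
    if R_PATTERN.search(text):
        found.add("R")
    return found


print(skills_in_posting("Experience with SAS, R, or SPSS required."))
print(skills_in_posting("Our R&D team uses Java and C++."))
```

A single rule like this still misfires on some text (a person's middle initial, for example), which is why several rules had to be developed by trial and error.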
Here are the results from December - each percentage is relative to SAS. For example, the R line indicates that the term R appeared in 52% as many postings as SAS did.
- SQL, 421%
- Java, 268%
- Excel, 231%
- SAS, 100%
- C++, 97%
- Perl, 70%
- Python, 70%
- MySQL, 62%
- R, 52%
- SPSS, 30%
- Matlab, 11%
- Minitab, 3%
- Stata, 3%
- JMP, 2%
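The percentages above and the "SAS jobs per R job" figure are two views of the same number: R at 52% of SAS is the same as 100 / 52, or about 1.92 SAS postings per R posting. A small Python sketch of that conversion, using only the published percentages (the dictionary and function names are just for illustration):

```python
# December results, each expressed as a percentage of SAS postings.
RELATIVE_TO_SAS = {
    "SQL": 421, "Java": 268, "Excel": 231, "SAS": 100, "C++": 97,
    "Perl": 70, "Python": 70, "MySQL": 62, "R": 52, "SPSS": 30,
    "Matlab": 11, "Minitab": 3, "Stata": 3, "JMP": 2,
}


def sas_postings_per(skill):
    """Number of SAS postings for every posting that mentions `skill`."""
    return round(RELATIVE_TO_SAS["SAS"] / RELATIVE_TO_SAS[skill], 2)


print(sas_postings_per("R"))  # 1.92
```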
We plan to run another scrape shortly - comments welcome!
(Editor - see also KDnuggets Annual Software Poll: RapidMiner and R vie for first place, June 2013.)