U. Chicago Center for Data Science and Public Policy: Postdoc in Natural Language Processing

Company: Center for Data Science and Public Policy at U. Chicago
Location: Chicago, IL, USA
Web: dsapp.uchicago.edu/
Position: Postdoc Position in Natural Language Processing

Apply online.

Are you interested in using your natural language processing skills to make a social impact? Want to work with the White House and a team from government, academia, and industry to change how job training programs are created all over the US? The Center for Data Science and Public Policy at the University of Chicago offers postdoctoral fellowships for researchers with a strong background in machine learning and natural language processing and a passion for social impact and public policy. The fellow will lead the NLP work on our Workforce Science project, part of the White House’s Workforce Data Initiative.


We invite applicants for a postdoctoral research position in natural language processing applied to workforce science, working on an interdisciplinary research project to discover, identify, extract, and normalize skills and competencies required for every job in the US economy. The goal is to study the connections between the skills and competencies provided by educational programs, obtained by job seekers, and required by employers and create a publicly available system that government agencies and corporations can use to create training/educational programs for people most in need.

The researcher will help develop state-of-the-art methods and models for learning better word and phrase embeddings for skills from unstructured text, classifying skills and competencies in job postings, resumes, and course descriptions, and inferring hierarchies and linkages between skills and jobs. The primary responsibility will be developing and testing algorithms that can be used to create a new resource on skills for the entire US economy, and working with researchers from across academia and industry to improve the quality and usability of the generated data. The researcher will also be expected to produce research outputs such as peer-reviewed publications, and to provide technical oversight and mentoring to the programmers and data scientists who build and maintain our software and hardware infrastructure and who do the majority of the data collection and analysis for the White House Workforce Data Initiative.

The ideal candidate will have research experience in NLP, computational linguistics, and applied machine learning, with an interest in cross-disciplinary collaboration with social scientists. Prior experience with neural networks, word vectors, ladder networks, semi-supervised learning, and active learning techniques is strongly preferred. Experience handling large amounts of text data and parallel ML architectures is a big plus, since we will be dealing with hundreds of millions of job postings and resumes. The eventual goal is to further our understanding of labor markets and produce a data resource that government agencies and corporations can use to improve their training and hiring programs. Methods and results will be published in high-impact computer science and machine learning venues, in collaboration with economists, sociologists, government statisticians, and public policy researchers. See dsapp.uchicago.edu for example publications.


  • Strong academic record and PhD (finished or expected soon) in computer science, machine learning, statistics, computational or quantitative physical/social sciences or related areas
  • Experience working on real-world problems and passion for making a social impact
  • Experience with natural language processing and deep learning methods
  • Experience creating and tuning word, phrase, and document embeddings
  • Experience building classifiers on word embeddings, especially with limited negative cases
  • Experience with Spark, MLlib, and processing text at scale
  • Strong programming skills (ideally in Python)
  • Strong database skills
  • Data analysis, machine learning, data mining skills


Applications should be submitted online and include a CV, expected date of availability, and contact information for three references. Applications will continue to be accepted until positions are filled.


The Center for Data Science and Public Policy at the University of Chicago is a joint initiative of The Harris School of Public Policy and the Computation Institute that seeks to further the use of data science in policy research and practice. The center consists of data scientists with interdisciplinary backgrounds (computer science, statistics, math, physics, economics, social sciences), policy researchers, and practitioners who work together with external partners to solve problems with social impact. More information about our past work can be found at: http://dsapp.uchicago.edu


The White House Workforce Data Initiative (WDI) seeks to support the data architecture for the next generation of workforce innovation and increase interoperability between public and private sector workforce data by facilitating standards, structure, and open access points for data underlying training, skills, jobs, and wages. The three strategic pillars of the WDI are to 1) create a dynamic, common, locally relevant language for skills using public and private data, 2) improve available information on education and training outcomes through the creation of Workforce Training Provider Scorecards, and 3) support an ecosystem of industry partners building new products on top of these new data resources and standards. More information about the work of WDI can be found at: dataatwork.org