NYU Data Science Program – Things to Know Part 2

The NYU Data Science program reviewed from the inside, including courses on Machine Learning, Big Data, and Deep Learning, top professors, the great NYC location, and future plans.

By Ran Bi, June 2014

Here is Part 1

Three core courses were offered this spring: Machine Learning by Prof. David Sontag, Big Data by Prof. Juliana Freire, and Deep Learning by Prof. Yann LeCun. It is pretty exciting, since all three courses cover the hottest topics in data science.

Machine Learning

If we had a vote for the most popular professor in the data science program, I bet it would go to Prof. David Sontag. Also, if there were one for best TA, it would definitely go to Yoni. This course provides a good introduction to machine learning topics such as SVMs, learning theory, and Bayesian methods. It may be easy, though, for people with a strong computer science background. It is part of a two-course series, and the second course, Inference and Representation, will be offered in Fall 2014.

I enjoyed David’s passionate lectures and also learned from his assignments, which I think could be treated as mini-projects. The most exciting night was the poster session at the end of the semester, when all the students gathered and presented their project posters. Topics from the lectures were applied to all kinds of interesting projects, like face recognition, music detection, and financial price prediction. Here are some pictures of my classmates from that night.



Big Data

Female professors are a rare sight in data science. I have met two here: Prof. Claudia Perlich, who taught Intro to Data Science, and Prof. Juliana Freire, who is the Graduate Study Director of CDS and also taught us Big Data. We learned big data algorithms and applied tools to real datasets. To name a few, we explored MTA subway fare data using VisTrails and SQL, wrote a MapReduce program in Python and ran it on AWS over 100,000 Wikipedia documents, and implemented Pig Latin queries on tweet data.
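For readers who have not seen MapReduce before, here is a minimal sketch of the idea in plain Python. It is not the assignment code: the word-count task, the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the toy documents are all assumptions for illustration, and everything runs in memory on one machine rather than on AWS.

```python
from collections import defaultdict
from itertools import chain


def map_phase(doc_id, text):
    """Mapper: emit a (word, 1) pair for every word in one document."""
    for word in text.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, counts):
    """Reducer: sum the counts collected for a single word."""
    return word, sum(counts)


def word_count(docs):
    """Run map, shuffle, and reduce over a dict of {doc_id: text}."""
    mapped = chain.from_iterable(
        map_phase(doc_id, text) for doc_id, text in docs.items()
    )
    grouped = shuffle(mapped)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    # Hypothetical toy input standing in for the Wikipedia documents.
    toy_docs = {"doc1": "big data tools", "doc2": "Big data algorithms"}
    print(word_count(toy_docs))
```

On a real cluster the mapper and reducer would run as separate processes on many machines, and the framework would handle the shuffle step in between.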

There were guest lectures by John Langford on fast learning algorithms (see my post on Vowpal Wabbit) and by Leon Bottou on large-scale machine learning. The Big Data labs, on the other hand, need clearer objectives. I believe they will get better in the future. In a nutshell, this course introduces some basic ideas of big data, but not in much depth.

New York Subway stations visualization

A screenshot from my assignment. Each point represents a subway station in NYC; the redder the point, the more fares were sold on Feb 1st than on Jan 18th.

Deep Learning

Around 2011, companies like Google, IBM and Microsoft started to become interested in deep learning, after its effectiveness was demonstrated by results in speech recognition. In the first deep learning lecture, Yann showed us an amazing demo of a deep learning system that was trained to recognize images. As one of the leading experts in deep learning, Prof. Yann LeCun gave informative lectures that covered almost all the hottest topics in the field. We used Torch for the class and for our projects.


The program is growing and changing according to our feedback. “We are hiring an instructor for ‘Programming for Data Science’,” Roy said. The course is for people who are not super good at programming and mainly covers Python, which we mostly use for the required courses. “We are also recruiting a Senior Director, and 18 more faculty will be hired in the future.” People who are interested in these positions can find more details on the CDS website.

Facebook reflected in Watson

“Within a mile from where you are sitting, we have Google, Watson, Facebook’s new lab, probably a hundred startups. And we have partnerships with Facebook, Yahoo Research, and Microsoft.” There is still a long way to go before data science becomes a well-defined discipline. But I feel excited to be a witness and a participant right now.

A picture taken at Astor Place, which is a 5-minute walk from CDS. The blue building will contain IBM’s new Watson Division. The reflected building contains the NYC Facebook office.