Silver Blog, May 2017
The Best Python Packages for Data Science

This report is the second in a series analyzing data science-related topics. This time, we rank 15 top Python data science packages, with results we hope will be of use to the data science community.



By The Data Incubator.

This report was originally published on The Data Incubator Blog. You can view the report in its entirety here: Ranked 15 Python Packages for Data Science

Best Python Packages

The most frequently asked question in our data science training program is "what is the best programming language for machine learning?"

The resulting discussion, depending on the day, ends in either a hotly contested debate among R, Python, and MATLAB fans or a full-on WWE wrestling match.

Ultimately, the programming language of choice for machine learning comes down to three criteria:

  • The type of problem the data scientist is working with
  • Personal programming preferences
  • The type of machine learning they're looking to perform

In other words, it depends. However, there is no doubt that Python is a language of choice for a large percentage of data scientists who want to understand data, especially those looking to leverage its great data science packages. Python is also open source, which is great for anyone looking to get started with data science in their spare time.

About the report

At The Data Incubator, we pride ourselves on having the latest data science curriculum. Much of our course material is based on feedback from corporate and government partners about the technologies they are looking to learn. However, we wanted to develop a more data-driven approach to what we teach in our data science corporate training and our free fellowship for data science master's and PhD graduates looking to begin their careers in the industry.

This report is the second in a series analyzing data science-related topics. To see more, be sure to check out our R Packages for Machine Learning report. We thought it would be useful to the data science community to rank and analyze a variety of topics related to the profession in simple, easy-to-digest cheat sheets, rankings, or reports.

This report ranks Python packages for Data Science, and we’re hoping to stir the pot a bit and get our colleagues to join the discussion. Our discoveries here aren’t final, but rather serve to showcase the depth, and the breadth, of knowledge available to the data science community.

Python, along with R, is one of the most popular tools in a data scientist’s arsenal, mostly for its simplicity and ease of use: most concepts can be expressed in fewer lines of code in Python than in other languages.

The rankings chart:

Rankings chart
