Top 10 Python Data Science Libraries
The third part of our series investigating the top Python Libraries across Machine Learning, AI, Deep Learning and Data Science.
Python continues to lead the way when it comes to Machine Learning, AI, Deep Learning and Data Science tasks. According to builtwith.com, 45% of technology companies prefer to use Python for implementing AI and Machine Learning.
Because of this, we’ve decided to start a series investigating the top Python libraries across several categories:
Top 8 Python Machine Learning Libraries
Top 13 Python Deep Learning Libraries
Top 10 Python Data Science Libraries – this post
Top X Python Reinforcement Learning and evolutionary computation Libraries – COMING SOON!
Of course, these lists are entirely subjective as many libraries could easily place in multiple categories. As always, please feel free to vent your frustrations/disagreements/annoyance in the comments section below!
Top 10 Python Data Science Libraries by GitHub Contributors, Commits and Size (size of the circle)
Now, let’s get onto the list (GitHub figures correct as of November 16th, 2018):
1. pandas (Contributors – 1328, Commits – 18162, Stars – 16890)
“pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with "relational" or "labeled" data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python.”
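To give a feel for what "labeled" data means in practice, here is a minimal sketch. The city names and sales figures are made up for illustration; only the pandas calls themselves (`DataFrame`, `groupby`, `sum`) come from the library.

```python
import pandas as pd

# Hypothetical sales records; the column names and values are illustrative only.
df = pd.DataFrame(
    {
        "city": ["Leeds", "York", "Leeds", "York"],
        "sales": [10, 20, 30, 40],
    }
)

# Group rows by the "city" label and total the "sales" column per group.
totals = df.groupby("city")["sales"].sum()

print(totals)
```

The result is a Series indexed by city name, so individual totals can be looked up by label, for example `totals["Leeds"]`.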
2. Matplotlib (Contributors – 771, Commits – 27937, Stars – 8224)
“Matplotlib is a Python 2D plotting library which produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms. Matplotlib can be used in Python scripts, the Python and IPython shell (à la MATLAB or Mathematica), web application servers, and various graphical user interface toolkits.”
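As a rough illustration of the scripting use case, the sketch below draws a simple line plot and renders it to PNG in memory. The data points and labels are invented for the example, and the non-interactive "Agg" backend is chosen here only so the script runs without a display.

```python
import io

import matplotlib

matplotlib.use("Agg")  # assumption: render off-screen, no GUI needed
import matplotlib.pyplot as plt

# Illustrative data: the squares of a few small integers.
xs = [0, 1, 2, 3]
ys = [x ** 2 for x in xs]

fig, ax = plt.subplots()
(line,) = ax.plot(xs, ys, marker="o", label="y = x^2")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()

# Write the figure as PNG bytes into an in-memory buffer instead of a file.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```

Passing a filename such as `"squares.png"` to `savefig` instead of the buffer would write the same figure to disk.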
3. NumPy (Contributors – 708, Commits – 19241, Stars – 8666)
“NumPy is the fundamental package needed for scientific computing with Python. It provides a powerful N-dimensional array object, sophisticated (broadcasting) functions, tools for integrating C/C++ and Fortran code and useful linear algebra, Fourier transform, and random number capabilities.”
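The array object, broadcasting, and linear algebra pieces mentioned in that description can be sketched in a few lines. The arrays below are arbitrary examples chosen for this post, not anything taken from the NumPy documentation.

```python
import numpy as np

# A 2x3 array: [[0, 1, 2], [3, 4, 5]].
matrix = np.arange(6).reshape(2, 3)

# Broadcasting: the 1-D row is added to every row of the 2-D array.
row = np.array([10, 20, 30])
shifted = matrix + row

# Linear algebra: solve the small system a @ x = b.
a = np.array([[2.0, 0.0], [0.0, 4.0]])
b = np.array([2.0, 8.0])
x = np.linalg.solve(a, b)

print(shifted)
print(x)
```

Broadcasting works here because the trailing dimension of `matrix` (3) matches the length of `row`, so no explicit loop or copy of `row` is needed.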
4. SciPy (Contributors – 670, Commits – 20080, Stars – 5096)
“SciPy (pronounced "Sigh Pie") is open-source software for mathematics, science, and engineering. It includes modules for statistics, optimization, integration, linear algebra, Fourier transforms, signal and image processing, ODE solvers, and more.”
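Two of the modules listed there, integration and optimization, can be tried out with a short sketch. The functions being integrated and minimized are simple examples picked for this post.

```python
from scipy import integrate, optimize

# Numerically integrate x^2 over [0, 1]; the exact answer is 1/3.
area, abs_error = integrate.quad(lambda x: x ** 2, 0.0, 1.0)

# Find the minimum of (x - 2)^2, which lies at x = 2.
result = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2)

print(area)
print(result.x)
```

`quad` returns both the estimate and an error bound, while `minimize_scalar` returns a result object whose `x` attribute holds the location of the minimum.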
5. Bokeh (Contributors – 325, Commits – 17365, Stars – 8439)
“Bokeh is an interactive visualization library for Python that enables beautiful and meaningful visual presentation of data in modern web browsers. With Bokeh, you can quickly and easily create interactive plots, dashboards, and data applications.”
6. Gensim (Contributors – 299, Commits – 3676, Stars – 8107)
“Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora. Target audience is the natural language processing (NLP) and information retrieval (IR) community.”
7. Scrapy (Contributors – 295, Commits – 6802, Stars – 30014)
“Scrapy is a fast high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.”
8. StatsModels (Contributors – 164, Commits – 10896, Stars – 3383)
“Statsmodels is a Python package that provides a complement to scipy for statistical computations including descriptive statistics and estimation and inference for statistical models.”
9. plotly.py (Contributors – 62, Commits – 3291, Stars – 4218)
“plotly.py is an interactive, open-source, and browser-based graphing library for Python. Built on top of plotly.js, plotly.py is a high-level, declarative charting library. plotly.js ships with over 30 chart types, including scientific charts, 3D graphs, statistical charts, SVG maps, financial charts, and more.”
10. pydot (Contributors – 12, Commits – 169, Stars – 267)
“pydot is an interface to Graphviz, can parse and dump into the DOT language used by Graphviz and is written in pure Python.”
Keep an eye out for the final part of this series, which focuses on Reinforcement Learning and evolutionary computation libraries and will be published over the next few weeks!

