25 GitHub Repositories Every Python Developer Should Know

Check out these repositories to help you improve your data science skills.

By Abhay Parashar, Machine Learning Enthusiast

Photo by heylagostechie on Unsplash

Have you ever been stuck asking yourself questions like:

  1. What does code written at FAANG companies look like?
  2. How can I write code like them?
  3. I have learned this, now what?

Well, the answer to all your questions is GitHub.


What is GitHub?

Learning how to code is easy, but learning how to write better code is tough. GitHub can show you exactly what you need to know. It is like a goldmine for developers, where the gold is the code written by other developers. With the help of GitHub, you can learn how to write better code, what good code looks like, and the steps you need to follow to become a better developer.


Did You Know?


According to Stack Overflow, Python is the most preferred language.

It is GitHub’s second-most popular language.

There are more than 147,000 packages in Python’s package repository.

It is reported as one of the most used and best tools for data science.



Most of the repositories included in this article are based on data science and machine learning. Let’s divide the list of repositories into five parts.

  1. Learning
  2. Books
  3. Projects
  4. Interview Preparation
  5. Frameworks, Modules and Tools




Learning


1. TheAlgorithms/Python

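As the name suggests, this repo contains implementations of almost every algorithm you are likely to need. The repository itself is meant to be read and cloned rather than installed. The command pip install algorithms installs a similar collection that is published from a separate repository, keon/algorithms.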

An example of merge sort using the repo package.
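For reference, here is a minimal merge sort sketch written from scratch with the standard library only. It is not copied from the repository or from any package, and the function names merge_sort and merge are chosen purely for illustration.

```python
def merge(left, right):
    """Merge two already-sorted lists into one sorted list."""
    merged = []
    i = j = 0
    # Repeatedly take the smaller head element from either list.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One of the lists is exhausted; append whatever remains of the other.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(items):
    """Return a new sorted list; the input is left unchanged."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    # Sort each half recursively, then merge the two sorted halves.
    return merge(merge_sort(items[:mid]), merge_sort(items[mid:]))


if __name__ == "__main__":
    print(merge_sort([5, 2, 9, 1, 5, 6]))
```

Reading a version like this next to the repository's own implementation is a good way to compare style, naming, and edge-case handling.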

This library is not limited to algorithms. It also covers matrix operations, graphs, and more.

Stats : (109k+ ⭐) (30.1k+ Forked)


2. vinta/awesome-python

A curated list of awesome Python frameworks, libraries, software, and resources.

The repository is self-explanatory, but if you find it hard to navigate, the project also has its own website with an awesome GUI.

Stats : (99k+ ⭐) (19k+ Forked)


3. jerry-git/learn-python3

This repository is a collection of Jupyter notebooks for learning Python. It is best for a newcomer to Python who wants to get their hands dirty by solving problems.

Each Notebook contains a bit of theory, code, and coding exercises.

Stats : (3.9k+ ⭐) (1k+ Forked)


4. trekhleb/learn-python

Playground and cheatsheet for learning Python. Collection of Python scripts that are split by topics and contain code examples with explanations.

It is another great repository for learning Python by topic.

Stats : (7.5k+ ⭐) (1.4k+ Forked)


5. Avik-Jain/100-Days-Of-ML-Code

This repository is best for all data science learners. It has a total of 100 days of code with different topics and algorithms.

The notebooks available in the repo are easy to understand and self-explanatory.

Stats : (32.2k+ ⭐) (8.1k+ Forked)




Books


6. Hitchhiker’s Guide to Python:

A best-practice handbook for the installation, configuration, and usage of Python on a daily basis.

It covers pip, NumPy, SciPy, statpy, pyplot, matplotlib, server configurations and tools for various web frameworks, virtualenv, and many more topics.

Stats : (23k+ ⭐) (5.6k+ Forked)


7. Cosmic Python:

A book about Pythonic Application Architecture Patterns for Managing Complexity.

Stats : (1.9k+ ⭐) (290+ Forked)


8. Byte of Python:

“A Byte of Python” is a free book on programming using the Python language. It serves as a tutorial for readers who are new to programming. If all you know about computers is how to save text files, then this is the book for you.

Stats : (1.5k+ ⭐) (894+ Forked)


9. Python Machine Learning:

It is the code repository for the classic Python Machine Learning Book. It contains code for each chapter.

Stats : (2.2k+ ⭐) (967+ Forked)


Open Source Projects



10. shobrook/rebound

A command-line tool that instantly fetches Stack Overflow results when an exception is thrown. Just run your program with the rebound command instead of calling the interpreter directly.



Stats : (3.6k+ ⭐) (336+ Forked)


11. openai/gym

It is an open-source toolkit for developing and comparing reinforcement learning algorithms. It is compatible with any numerical computation library, such as TensorFlow or Theano.

The documentation is available on their site. See the FAQ for more information.

Stats : (24.3k+ ⭐) (7k+ Forked)


12. facebookresearch/Detectron

Detectron is the Facebook AI Research team's software system for object detection. It implements state-of-the-art object detection algorithms, including Mask R-CNN.

It is written in Python and powered by the Caffe2 deep learning framework.



Stats : (24.4k+ ⭐) (5.3k+ Forked)


13. iperov/DeepFaceLab

DeepFaceLab is the leading software for creating deepfakes. According to the project, more than 95% of deepfake videos on the internet are created using DeepFaceLab.

With it, you can swap a face, de-age a face, replace the head, manipulate lips, and more.

Stats : (26.5k+ ⭐) (6k+ Forked)


14. ageitgey/face_recognition

Best library for building face recognition applications. It is one of the simplest facial recognition APIs for Python and the command line.

The face recognition library computes a 128-dimensional encoding (a vector of 128 measurements) for each face it detects. The encoding of an unknown face can then be compared with the encodings of known faces to find the closest match and fetch the label (name) of the person.
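The comparison step can be sketched in plain Python. This is only an illustration of the idea, not the library's code: the vectors below are made up (a real encoding comes from a neural network), the helper names face_distance and best_match are hypothetical, and the 0.6 threshold is an assumed value chosen for the example.

```python
import math


def face_distance(encoding_a, encoding_b):
    """Euclidean distance between two encodings of equal length."""
    if len(encoding_a) != len(encoding_b):
        raise ValueError("encodings must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(encoding_a, encoding_b)))


def best_match(known_encodings, unknown_encoding, tolerance=0.6):
    """Return the name of the closest known face within tolerance, else None.

    known_encodings maps a label (name) to its encoding.
    tolerance is an assumed cut-off: smaller values mean stricter matching.
    """
    best_name, best_distance = None, tolerance
    for name, encoding in known_encodings.items():
        distance = face_distance(encoding, unknown_encoding)
        if distance <= best_distance:
            best_name, best_distance = name, distance
    return best_name


if __name__ == "__main__":
    # Toy 128-dimensional vectors standing in for real face encodings.
    known = {"alice": [0.10] * 128, "bob": [0.50] * 128}
    print(best_match(known, [0.12] * 128))  # close to alice's vector
    print(best_match(known, [0.90] * 128))  # far from every known vector
```

Because a match is just a distance below a threshold, tightening or loosening the tolerance trades false matches against missed matches.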

Stats : (40.1k+ ⭐) (11.2k+ Forked)


15. You Get by Mort Yao

It is a tiny command-line utility to download media content (video, audio, images) from the Web.

pip install you-get


Stats : (40.5k+ ⭐) (8.4k+ Forked)


Interview Preparation


16. donnemartin/interactive-coding-challenges

120+ interactive Python coding interview challenges (algorithms and data structures). Includes Anki flashcards.

It has programming questions related to arrays, linked lists, graphs, recursion, and more.
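To give a flavour of this kind of challenge, here is a hypothetical linked-list exercise (reverse a singly linked list) with a small self-check. It is not taken from the repository; the Node class and the function names are assumptions made for this sketch.

```python
class Node:
    """A node in a singly linked list."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


def from_list(values):
    """Build a linked list from a Python list and return its head."""
    head = None
    for value in reversed(values):
        head = Node(value, head)
    return head


def to_list(head):
    """Collect the values of a linked list into a Python list."""
    values = []
    while head is not None:
        values.append(head.value)
        head = head.next
    return values


def reverse(head):
    """Reverse the list in place and return the new head."""
    previous = None
    current = head
    while current is not None:
        following = current.next   # remember the rest of the list
        current.next = previous    # point this node backwards
        previous = current
        current = following
    return previous


if __name__ == "__main__":
    print(to_list(reverse(from_list([1, 2, 3]))))
```

Working a problem like this by hand first, and only then checking it against a reference solution, is where most of the interview practice value comes from.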

Stats : (22.7k+ ⭐) (3.6k+ Forked)


17. learning-zone/python-interview-questions

A list of 300 Python interview questions with solutions. It also contains solutions to many programming questions, for example on hash maps.

Stats : (313+ ⭐) (85+ Forked)


18. zhiwehu/Python-programming-exercises

100+ challenging Python programming exercises at different levels.

Stats : (15.7k+ ⭐) (5.8k+ Forked)


19. MTrajK/coding-problems

This repository contains solutions for various coding/algorithmic problems and many useful resources for learning algorithms and data structures.

It contains problems and solutions for Arrays, Linked List, Trees, Hashing DS, Dynamic Programming, Strings, Math, and others.

Stats : (1.9k+ ⭐) (348+ Forked)


Frameworks, Modules, and Tools

The packages mentioned below can help you understand how code is written in large projects developed by the tech giants. By reading through the code in these repos, you can steadily improve your own coding skills.


20. tensorflow/tensorflow

TensorFlow is Google's official open-source platform for end-to-end machine learning. It has a comprehensive, flexible ecosystem of tools and libraries that gives developers the power to build and deploy ML apps easily.

It provides a stable Python API and can be easily installed using pip.

Stats : (156k+ ⭐) (84.8k+ Forked)


21. Dash by Plotly

A framework for building analytical web apps in Python, R, Julia, and Jupyter. No JavaScript required.

Its maintainers describe it as the most downloaded and trusted Python framework for building ML and data science web apps.

It is built on top of plotly.js, which is also a great package for data visualization.

Stats : (14.6k+ ⭐) (1.5k+ Forked)


22. streamlit/streamlit

Streamlit provides the fastest way to build data apps in Python. Streamlit lets you turn data scripts into shareable web apps in minutes, not weeks.

It’s all Python, open-source, and free! And once you’ve created an app you can use their free sharing platform to deploy, manage, and share your app with the world.

Stats : (14.7k+ ⭐) (1.3k+ Forked)


23. scikit-learn/scikit-learn

scikit-learn is a Python module for machine learning built on top of SciPy and is distributed under the 3-Clause BSD license.

It is one of the most widely used and best-known modules for machine learning tasks. It ships with a wide variety of prebuilt algorithms and data analysis tools.

Stats : (45.8k+ ⭐) (21.5k+ Forked)


24. mwaskom/seaborn

Seaborn is a Python library for statistical data visualization, built on top of the matplotlib library. It offers a simple syntax and a variety of good-looking plots such as box plots, count plots, violin plots, histograms, and more.

Stats : (8.5k+ ⭐) (1.4k+ Forked)


25. numpy/numpy

NumPy is the fundamental package needed for scientific computing with Python.

It stands for Numerical Python and is a Python library for all kinds of scientific calculation. It provides a multidimensional array object and a collection of routines to process such arrays.

It adds support for matrices and large multidimensional arrays, along with a large collection of high-level mathematical functions that operate on them.

Stats : (17.3k+ ⭐) (5.6k+ Forked)


Bonus Repositories


1. tuvtran/project-based-learning

The repository is packed with tutorials for a total of 20 programming languages, including Python, Go, PHP, and Java. The main aim of the repo is project-based learning. Its Python section includes tons of tutorials for building a host of projects, from web scrapers, bots, and web applications to data science, machine learning, and deep learning solutions.

Stats : (50.6k+ ⭐) (7.9k+ Forked)


2. public-apis/public-apis

A collective list of free APIs for use in software and web development.

Stats : (126k+ ⭐) (15.4k+ Forked)


3. EbookFoundation/free-programming-books

It contains a list of free programming books for learning. It has 1.5k+ contributors and lists over 10,000 free books in PDF form. The list covers books in many different languages, such as Chinese, Dutch, Russian, Italian, and more.

Stats : (190k+ ⭐) (42.4k+ Forked)





Bio: Abhay Parashar is a machine learning enthusiast who is looking forward to building a career in the field of artificial intelligence.

Original. Reposted with permission.