25 GitHub Repositories Every Python Developer Should Know
Check out these repositories to help you improve your data science skills.
By Abhay Parashar, Machine Learning Enthusiast
Photo by heylagostechie on Unsplash
Have you ever been stuck asking yourself questions like:
- What does code look like that’s written at FAANG companies?
- How can I write code like them?
- I have learned this, now what?
Well, the answer to all your questions is GitHub.
What is GitHub?
Learning how to code is easy, but learning how to write better code is tough. GitHub can show you exactly what you need to know. It is like a gold mine for developers, where the gold is the code written by other developers. With the help of GitHub, you can learn how to write better code, what good code looks like, and the steps you need to follow to become a better developer.
Did You Know?
According to Stack Overflow, Python is the most preferred language.
It is GitHub’s second-most popular language.
There are more than 147,000 packages in Python’s package repository.
It is reported as one of the most used and best tools for data science.
Most of the repositories included in this article are based on data science and machine learning. Let’s divide the list of repositories into five parts.
- Interview Preparation
- Frameworks, Modules and Tools
As the repository name suggests, this repo contains almost every algorithm you will ever need. You can even install the repo as a package using:
pip install algorithms
An example of merge sort using the repo package.
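A merge sort along these lines can be sketched in plain Python. This is a minimal, self-contained illustration of the algorithm, not the package's own implementation; the function name `merge_sort` and the sample list are assumptions chosen for the example.

```python
def merge_sort(items):
    """Return a new sorted list using top-down merge sort."""
    # A list of zero or one element is already sorted.
    if len(items) <= 1:
        return list(items)

    # Split the list in half and sort each half recursively.
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])

    # Merge the two sorted halves into one sorted list.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1

    # One half may still have elements left; they are already sorted.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


if __name__ == "__main__":
    print(merge_sort([1, 8, 3, 5, 6]))  # prints [1, 3, 5, 6, 8]
```

The function returns a new list and leaves its input untouched, which keeps the sketch easy to test.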
This library is not limited to algorithms. It also contains operations on matrices, graphs, and more.
Stats : (109k+ ⭐) (30.1k+ Forked)
A curated list of awesome Python frameworks, libraries, software, and resources.
The repository is self-explanatory, but if you find it hard to navigate, the project has its own website with a polished interface.
Stats : (99k+ ⭐) (19k+ Forked)
This repository is a collection of Jupyter notebooks for learning Python. It is best for a newcomer to Python who wants to get their hands dirty by solving problems.
Each Notebook contains a bit of theory, code, and coding exercises.
Stats : (3.9k+ ⭐) (1k+ Forked)
📚 Playground and cheatsheet for learning Python. Collection of Python scripts that are split by topics and contain code examples with explanations.
It is another great repository for learning Python topic by topic.
Stats : (7.5k+ ⭐) (1.4k+ Forked)
This repository is best for all data science learners. It has a total of 100 days of code with different topics and algorithms.
The notebooks available in the repo are easy to understand and self-explanatory.
Stats : (32.2k+ ⭐) (8.1k+ Forked)
A best practice handbook to the installation, configuration, and usage of Python on a daily basis.
It covers pip, NumPy, SciPy, statpy, pyplot, matplotlib, server configurations and tools for various web frameworks, virtualenv, and many more topics.
Stats : (23k+ ⭐) (5.6k+ Forked)
A book about Pythonic Application Architecture Patterns for Managing Complexity.
Stats : (1.9k+ ⭐) (290+ Forked)
“A Byte of Python” is a free book on programming using the Python language. It serves as a tutorial for a beginner audience. If all you know about computers is how to save text files, then this is the book for you.
Stats : (1.5k+ ⭐) (894+ Forked)
It is the code repository for the classic Python Machine Learning Book. It contains code for each chapter.
Stats : (2.2k+ ⭐) (967+ Forked)
Open Source Projects
A command-line tool that instantly fetches Stack Overflow results when an exception is thrown. Just use the rebound command to run your program.
Stats : (3.6k+ ⭐) (336+ Forked)
It is an open-source toolkit for developing and comparing reinforcement learning algorithms. It is compatible with any numerical computation library, such as TensorFlow or Theano.
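Reinforcement learning toolkits of this kind commonly model a problem as an environment that an agent resets and then steps through, receiving a reward after each action. The sketch below illustrates that loop with the standard library only. `CoinFlipEnv`, `run_episode`, and the four-value return of `step` are hypothetical names and conventions chosen for illustration, not the toolkit's actual API.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment: the agent guesses a coin flip."""

    def __init__(self, episode_length=10, seed=0):
        self.episode_length = episode_length
        self.rng = random.Random(seed)
        self.steps = 0

    def reset(self):
        """Start a new episode and return the initial observation."""
        self.steps = 0
        return 0  # this toy environment has a single dummy observation

    def step(self, action):
        """Apply an action (0 or 1) and return (observation, reward, done, info)."""
        coin = self.rng.randint(0, 1)
        reward = 1.0 if action == coin else 0.0
        self.steps += 1
        done = self.steps >= self.episode_length
        return 0, reward, done, {}


def run_episode(env, policy):
    """Run one episode with the given policy and return the total reward."""
    observation = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        action = policy(observation)
        observation, reward, done, _info = env.step(action)
        total_reward += reward
    return total_reward


if __name__ == "__main__":
    env = CoinFlipEnv(episode_length=10)
    # A trivial policy that always guesses "1".
    print(run_episode(env, lambda observation: 1))
```

Because the environment only exchanges plain numbers with the agent, the policy can be written with any numerical library, which is the kind of compatibility the description refers to.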
Stats : (24.3k+ ⭐) (7k+ Forked)
Detectron is Facebook AI Research's software system for object detection. It implements state-of-the-art object detection algorithms, including Mask R-CNN.
It is written in Python and powered by the Caffe2 deep learning framework.
Stats : (24.4k+ ⭐) (5.3k+ Forked)
DeepFaceLab is the leading software for creating deepfakes. More than 95% of deepfake videos on the internet are reportedly created using DeepFaceLab.
With the help of deepfakes, you can change the face, de-age the face, replace the head, manipulate lips, and more.
Stats : (26.5k+ ⭐) (6k+ Forked)
The best library for building face recognition applications. It is one of the simplest facial recognition APIs for Python and the command line.
The face recognition library computes an encoding of 128 measurements for each face it detects. The encoding of an unknown face can then be compared with the encodings of known faces to fetch the label (name) of the person.
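The comparison step can be sketched as a nearest-neighbour lookup over vectors of 128 numbers per face. The code below is a standalone illustration using only the standard library, not the library's own code. The `identify` function, the sample vectors, and the 0.6 distance tolerance are assumptions made for the example.

```python
import math


def euclidean_distance(a, b):
    """Return the Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("encodings must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(unknown, known_faces, tolerance=0.6):
    """Return the name of the closest known face, or None if none is close enough.

    known_faces maps a name to its encoding; tolerance is an assumed
    cut-off on the distance between two encodings of the same person.
    """
    best_name, best_distance = None, float("inf")
    for name, encoding in known_faces.items():
        distance = euclidean_distance(unknown, encoding)
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name if best_distance <= tolerance else None


if __name__ == "__main__":
    # Made-up 128-number encodings standing in for real face encodings.
    known = {"alice": [0.0] * 128, "bob": [0.1] * 128}
    print(identify([0.01] * 128, known))  # prints alice
    print(identify([1.0] * 128, known))   # prints None
```

A lower tolerance makes matching stricter, at the cost of rejecting more genuine matches.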
Stats : (40.1k+ ⭐) (11.2k+ Forked)
It is a tiny command-line utility for downloading media content (video, audio, images) from the web.
pip install you-get
Stats : (40.5k+ ⭐) (8.4k+ Forked)
120+ interactive Python coding interview challenges (algorithms and data structures). Includes Anki flashcards.
It has programming questions related to arrays, linked lists, graphs, recursion, and more.
Stats : (22.7k+ ⭐) (3.6k+ Forked)
A list of 300 Python interview questions with solutions. It also contains solutions to many programming questions on topics such as hash maps.
Stats : (313+ ⭐) (85+ Forked)
100+ challenging Python programming exercises at different levels.
Stats : (15.7k+ ⭐) (5.8k+ Forked)
This repository contains solutions for various coding/algorithmic problems and many useful resources for learning algorithms and data structures.
It contains problems and solutions for Arrays, Linked List, Trees, Hashing DS, Dynamic Programming, Strings, Math, and others.
Stats : (1.9k+ ⭐) (348+ Forked)
Frameworks, Modules, and Tools
The packages mentioned below can help you understand how code is written in large projects developed by big tech companies. By browsing the code in these repos, you can improve your own coding skills.
TensorFlow is Google's open-source platform for end-to-end machine learning. It has a comprehensive, flexible ecosystem of tools and libraries that gives developers the power to build and deploy ML apps easily.
It provides a stable Python API and can be easily installed using pip.
Stats : (156k+ ⭐) (84.8k+ Forked)
It is the most trusted and most downloaded Python package for building ML and data science apps.
It is built on top of plotly.js, which is also a great package for data visualization.
Stats : (14.6k+ ⭐) (1.5k+ Forked)
Streamlit provides the fastest way to build data apps in Python. It lets you turn data scripts into sharable web apps in minutes, not weeks.
It’s all Python, open-source, and free! And once you’ve created an app you can use their free sharing platform to deploy, manage, and share your app with the world.
Stats : (14.7k+ ⭐) (1.3k+ Forked)
scikit-learn is a Python module for machine learning built on top of SciPy and is distributed under the 3-Clause BSD license.
It is one of the most widely used and best-known modules for machine learning tasks. It comes with a variety of prebuilt algorithms and data analysis tools.
Stats : (45.8k+ ⭐) (21.5k+ Forked)
Seaborn is a Python library for statistical data visualization built on top of the matplotlib library. It offers a simple syntax and a variety of good-looking plots, such as box plots, count plots, violin plots, histograms, and more.
Stats : (8.5k+ ⭐) (1.4k+ Forked)
NumPy is the fundamental package needed for scientific computing with Python.
The name stands for Numerical Python, and it is a Python library for all kinds of scientific calculation. It provides a multidimensional array object and a collection of routines to process such arrays.
It supports matrices and large multidimensional arrays, along with a large collection of high-level mathematical functions that operate on them.
Stats : (17.3k+ ⭐) (5.6k+ Forked)
The repository is packed with tutorials for a total of 20 programming languages, including Python, Go, PHP, and Java. The main aim of the repo is project-based learning. Its Python section includes tons of tutorials for building a host of projects, from web scrapers, bots, and web applications to data science, machine learning, and deep learning solutions.
Stats : (50.6k+ ⭐) (7.9k+ Forked)
A collective list of free APIs for use in software and web development.
Stats : (126k+ ⭐) (15.4k+ Forked)
It contains a list of free programming books for learning. It has 1.5k+ contributors and more than 10,000 free PDF books. It covers many different languages, such as Chinese, Dutch, Russian, Italian, and more.
Stats : (190k+ ⭐) (42.4k+ Forked)
Bio: Abhay Parashar is a machine learning enthusiast who is looking forward to building a career in the field of artificial intelligence.
Original. Reposted with permission.