
DIY Deep Learning Projects

Inspired by the great work of Akshay Bahadur, in this article you will see some projects applying Computer Vision and Deep Learning, with implementations and details so you can reproduce them on your computer.



LinkedIn Data Science Community


Akshay Bahadur is one of the great examples to come out of the Data Science community on LinkedIn. There are great people on other platforms like Quora, StackOverflow, YouTube, here, and in lots of forums, helping each other in many areas of science, philosophy, math, language and, of course, Data Science and its companions.

Akshay Bahadur.

But I think that in the past ~3 years the LinkedIn community has excelled at sharing great content in the Data Science space, from sharing experiences to detailed posts on how to do Machine Learning or Deep Learning in the real world. I always recommend that people entering this area be part of a community, and LinkedIn is one of the best. You will find me there all the time :).


Starting with Deep Learning and Computer Vision



Research in the Deep Learning space on classifying things in images, detecting them, and taking actions when a system “sees” something has been very important in this decade, with amazing results like surpassing human-level performance on some problems.

In this article I will show you the posts Akshay Bahadur has made in the space of Computer Vision (CV) and Deep Learning (DL). If you are not familiar with either of those terms, you can learn more about them here:

A “weird” introduction to Deep Learning
There are amazing introductions, courses and blog posts on Deep Learning. But this is a different kind of introduction. (towardsdatascience.com)

Two months exploring deep learning and computer vision
I decided to develop familiarity with computer vision and machine learning techniques. As a web developer, I found this… (towardsdatascience.com)

From Neuroscience To Computer Vision
A 50 Year Look At Human And Computer Vision (towardsdatascience.com)

Computer Vision by Andrew Ng — 11 Lessons Learned
I recently completed Andrew Ng’s computer vision course on Coursera. Ng does an excellent job at explaining many of the… (towardsdatascience.com)


1. Hand Movement Using OpenCV

Contribute to HandMovementTracking development by creating an account on GitHub. (github.com)

From Akshay:

To perform video tracking, an algorithm analyzes sequential video frames and outputs the movement of targets between the frames. There are a variety of algorithms, each having strengths and weaknesses. Considering the intended use is important when choosing which algorithm to use. There are two major components of a visual tracking system: target representation and localization, as well as filtering and data association.

Video tracking is the process of locating a moving object (or multiple objects) over time using a camera. It has a variety of uses, some of which are: human-computer interaction, security and surveillance, video communication and compression, augmented reality, traffic control, medical imaging and video editing.
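To make the "localization" part concrete, here is a minimal sketch of one common way to locate a target once it has been segmented: take the centroid of a binary mask from its image moments. The function name and the toy mask are hypothetical, and the sketch uses plain Python lists instead of OpenCV images so it runs anywhere.

```python
def mask_centroid(mask):
    """Return the (x, y) centroid of the nonzero pixels of a 2-D mask, or None.

    The mask is treated as binary: every nonzero pixel counts as 1.
    """
    m00 = m10 = m01 = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                m00 += 1   # zeroth moment: number of target pixels (area)
                m10 += x   # first moment along x
                m01 += y   # first moment along y
    if m00 == 0:
        return None        # nothing to localize in this frame
    # Truncate to integer pixel coordinates
    return (int(m10 / m00), int(m01 / m00))


# Hypothetical 5x5 frame with a 2x2 target blob
frame_mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(mask_centroid(frame_mask))
```

Tracking then amounts to repeating this per frame and remembering the recent centers, which is the role a fixed-length deque can play.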

This is all the code you need to reproduce the hand tracker:

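import numpy as np
import cv2
from collections import deque

# Lines marked "assumed" fill gaps in the published listing; they are a
# best-effort reconstruction, not necessarily the author's exact code.
cap = cv2.VideoCapture(0)  # assumed: open the default webcam
pts = deque(maxlen=64)  # the last 64 tracked centers

# HSV bounds of the tracked color. Despite the variable names, hue 110-130 on
# OpenCV's 0-179 hue scale corresponds to blue.
Lower_green = np.array([110, 50, 50])
Upper_green = np.array([130, 255, 255])
kernel = np.ones((5, 5), np.uint8)  # assumed: structuring element for erode/dilate

while True:
	ret, img = cap.read()
	if not ret:  # assumed: stop when no frame could be read
		break
	# assumed: threshold the frame in HSV space to get a binary mask
	hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
	mask = cv2.inRange(hsv, Lower_green, Upper_green)
	mask = cv2.erode(mask, kernel, iterations=2)
	mask = cv2.dilate(mask, kernel, iterations=1)
	# assumed: [-2] picks the contour list in both OpenCV 3.x and 4.x
	cnts = cv2.findContours(mask.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
	center = None
	if len(cnts) > 0:
		# use the largest contour as the target
		c = max(cnts, key=cv2.contourArea)
		((x, y), radius) = cv2.minEnclosingCircle(c)
		M = cv2.moments(c)
		if M["m00"] > 0:  # added guard against division by zero
			center = (int(M["m10"] / M["m00"]), int(M["m01"] / M["m00"]))
		if radius > 5 and center is not None:
			cv2.circle(img, (int(x), int(y)), int(radius), (0, 255, 255), 2)
			cv2.circle(img, center, 5, (0, 0, 255), -1)
	pts.appendleft(center)  # assumed: remember this frame's center
	# draw the trail, getting thinner toward older points
	for i in range(1, len(pts)):
		if pts[i - 1] is None or pts[i] is None:
			continue  # assumed: skip frames where nothing was detected
		thick = int(np.sqrt(len(pts) / float(i + 1)) * 2.5)
		cv2.line(img, pts[i - 1], pts[i], (0, 0, 225), thick)
	cv2.imshow("Frame", img)
	k = cv2.waitKey(30) & 0xFF
	if k == 32:  # space bar
		break  # assumed: leave the loop
# cleanup the camera and close any open windows
cap.release()  # assumed
cv2.destroyAllWindows()  # assumed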

Yep, 54 lines of code. Very simple, right? You will need to have OpenCV installed on your computer. If you have a Mac, check this out:

Install OpenCV 3 on MacOS
In this post, we will provide step by step instructions for installing OpenCV 3.3.0 (C++ and Python) on MacOS and OSX… (www.learnopencv.com)

If you have Ubuntu:

OpenCV: Install OpenCV-Python in Ubuntu
Now we have all the required dependencies, let's install OpenCV. Installation has to be configured with CMake. It… (docs.opencv.org)

and if you have Windows:

Install OpenCV-Python in Windows - OpenCV 3.0.0-dev documentation
In this tutorial We will learn to setup OpenCV-Python in your Windows system. Below steps are tested in a Windows 7-64… (docs.opencv.org)


2. Drowsiness Detection OpenCV

Contribute to Drowsiness_Detection development by creating an account on GitHub. (github.com)

This can be used by drivers who tend to drive for long periods of time, which may lead to accidents. The code detects your eyes and raises an alert when you are drowsy.


Code Requirements

  1. cv2
  2. imutils
  3. dlib
  4. scipy

Each eye is represented by 6 (x, y)-coordinates, starting at the left corner of the eye (as if you were looking at the person), and then working clockwise around the eye.


If the eye aspect ratio stays below 0.25 for 20 consecutive frames, an alert is generated.
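As a rough sketch of that rule: I am assuming the commonly used definition of the eye aspect ratio (the sum of the two vertical eye distances divided by twice the horizontal distance), which the text above does not spell out. The function names and the toy landmark coordinates are hypothetical; only the 0.25 threshold and the 20-frame window come from the description.

```python
import math

EAR_THRESHOLD = 0.25  # from the description above
CONSEC_FRAMES = 20    # from the description above


def _dist(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def eye_aspect_ratio(eye):
    """eye: six (x, y) landmarks p1..p6, starting at a corner, going clockwise.

    Assumed definition: (|p2 - p6| + |p3 - p5|) / (2 * |p1 - p4|).
    """
    p1, p2, p3, p4, p5, p6 = eye
    return (_dist(p2, p6) + _dist(p3, p5)) / (2.0 * _dist(p1, p4))


def should_alert(ear_values, threshold=EAR_THRESHOLD, frames=CONSEC_FRAMES):
    """True if the ratio stays below the threshold for `frames` frames in a row."""
    run = 0
    for ear in ear_values:
        run = run + 1 if ear < threshold else 0  # any open-eye frame resets the run
        if run >= frames:
            return True
    return False


# Toy landmarks: a wide-open eye and a nearly closed one
open_eye = [(0, 0), (2, 2), (4, 2), (6, 0), (4, -2), (2, -2)]
closed_eye = [(0, 0), (2, 0.5), (4, 0.5), (6, 0), (4, -0.5), (2, -0.5)]
print(eye_aspect_ratio(open_eye), eye_aspect_ratio(closed_eye))
```

The ratio drops toward zero as the eyelids close, and requiring a run of frames keeps ordinary blinks from triggering the alert.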




3. Digit Recognition using Softmax Regression

Digit-Recognizer - Machine Learning classifier for recognizing the digits. (github.com)

This code helps you classify different digits using softmax regression. You can install Conda for Python, which resolves all the dependencies for machine learning.


Softmax Regression (synonyms: Multinomial Logistic, Maximum Entropy Classifier, or just Multi-class Logistic Regression) is a generalization of logistic regression that we can use for multi-class classification (under the assumption that the classes are mutually exclusive). In contrast, we use the (standard) Logistic Regression model in binary classification tasks.
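To see why softmax is a generalization of logistic regression, here is a small sketch in plain Python. The function names are my own, not taken from the repository. With only two classes and one score fixed at zero, softmax reduces to the usual sigmoid.

```python
import math


def softmax(scores):
    """Turn a list of real-valued class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(scores):
    """Index of the most probable class (classes are mutually exclusive)."""
    probs = softmax(scores)
    return max(range(len(probs)), key=probs.__getitem__)


def sigmoid(z):
    """Binary logistic regression's link function."""
    return 1.0 / (1.0 + math.exp(-z))


# Two-class softmax with scores [z, 0] gives the same probability as sigmoid(z)
z = 1.7
print(softmax([z, 0.0])[0], sigmoid(z))
print(predict([0.1, 2.5, -1.0]))
```

For digit recognition, the scores would be ten numbers, one per digit, computed from the pixel values.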

Python Implementation

The dataset used was MNIST, with images of size 28 × 28, and the plan here is to classify digits from 0 to 9 using Logistic Regression, a Shallow Network and a Deep Neural Network.

One of the best parts here is that he coded all three models using NumPy, including optimization, forward and back propagation, and just everything.

For Logistic Regression: see code here

For a Shallow Neural Network: see code here

And finally with a Deep Neural Network: see code here

Execution for writing through webcam

To run the code, type:

python Dig-Rec.py

Execution for showing images through webcam

To run the code, type:

python Digit-Recognizer.py


Devanagiri Recognition

Devanagiri-Recognizer - Hindi Alphabet classifier using convnet (github.com)

This code helps you classify the different characters of the Hindi alphabet (Devanagari script) using convnets. You can install Conda for Python, which resolves all the dependencies for machine learning.

Technique Used

I have used convolutional neural networks. I am using TensorFlow as the framework and the Keras API for providing a high level of abstraction.


CONV2D → MAXPOOL → CONV2D → MAXPOOL → FC → Softmax → Classification

Some additional points

  1. You can go for additional conv layers.
  2. Add regularization to prevent overfitting.
  3. You can add additional images to the training set for increasing the accuracy.

Python Implementation

Dataset: DHCD (Devnagari Character Dataset), with images of size 32 × 32, classified with a convolutional network.
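To get a feeling for the CONV2D → MAXPOOL → CONV2D → MAXPOOL stack, this sketch traces how the side length of a 32 × 32 image shrinks layer by layer. The kernel and pool sizes below (5 × 5 convolutions without padding, 2 × 2 pooling) are hypothetical values chosen for illustration; the actual sizes used in the repository are not stated here.

```python
def conv2d_out(size, kernel, stride=1, padding=0):
    """Output side length of a square convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def maxpool_out(size, pool, stride=None):
    """Output side length of square max pooling (stride defaults to pool size)."""
    stride = stride or pool
    return (size - pool) // stride + 1


# Hypothetical layer settings: (layer type, kernel or pool size)
LAYERS = [("CONV2D", 5), ("MAXPOOL", 2), ("CONV2D", 5), ("MAXPOOL", 2)]


def trace(size, layers=LAYERS):
    """Return the side length after the input and after each layer."""
    sizes = [size]
    for kind, k in layers:
        size = conv2d_out(size, k) if kind == "CONV2D" else maxpool_out(size, k)
        sizes.append(size)
    return sizes


print(trace(32))  # → [32, 28, 14, 10, 5]
```

Under these assumed sizes, the fully connected layer would see 5 × 5 feature maps, one per filter of the last convolution.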

To run the code, type:

python Dev-Rec.py


4. Facial Recognition using FaceNet

Facial-Recognition-using-Facenet - Implementation of facial recognition using facenets. (github.com)

This code performs facial recognition using FaceNet (https://arxiv.org/pdf/1503.03832.pdf). The FaceNet concept was originally presented in a research paper, whose main idea is a triplet loss function for comparing images of different people. The implementation uses an Inception network taken from an external source, and fr_utils.py is taken from deeplearning.ai for reference. I have added several functionalities of my own for providing stability and better detection.
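Here is a minimal sketch of the triplet loss idea for a single triplet, written in plain Python rather than with the repository's code. The embeddings are toy 2-D vectors and the margin of 0.2 is an illustrative choice; real face embeddings have many more dimensions.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Loss for one triplet of embeddings.

    It is zero once the anchor is closer to the positive (same person) than to
    the negative (different person) by at least `margin`.
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(gap + margin, 0.0)


anchor = [0.0, 0.0]
same_person = [0.1, 0.0]
other_far = [1.0, 0.0]    # easy negative: already far from the anchor
other_near = [0.2, 0.0]   # hard negative: almost as close as the positive

print(triplet_loss(anchor, same_person, other_far))
print(triplet_loss(anchor, same_person, other_near))
```

Only the hard negative produces a nonzero loss, so training pushes those look-alike faces apart.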

Code Requirements

You can install Conda for Python, which resolves all the dependencies for machine learning.



A facial recognition system is a technology capable of identifying or verifying a person from a digital image or a video frame from a video source. There are multiple methods by which facial recognition systems work, but in general, they work by comparing selected facial features from a given image with faces within a database.

Functionalities added

  1. Detecting a face only when your eyes are open (a security measure).
  2. Using the face alignment functionality from dlib to predict effectively while live streaming.

Python Implementation

  1. Network Used- Inception Network
  2. Original Paper — Facenet by Google


  1. If you want to train the network, run Train-inception.py. However, you don't need to do that, since I have already trained the model and saved it as the face-rec_Google.h5 file, which gets loaded at runtime.
  2. Now you need to have images in your database. The code checks the /images folder for that. You can either paste your pictures there or take them with your webcam. To do that, run create-face.py; the images get stored in the /incept folder, and you have to manually paste them into the /images folder.
  3. Run rec-feat.py to run the application.
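The general idea of comparing a face with the faces in a database can be sketched as a nearest-neighbour search over embeddings. This is an illustration under my own assumptions, not the logic of rec-feat.py: the function name, the 3-D toy embeddings and the 0.7 distance threshold are all hypothetical.

```python
import math


def identify(embedding, database, threshold=0.7):
    """Return (name, distance) of the closest known face, or (None, distance)
    when even the closest one is farther away than `threshold`."""
    best_name, best_dist = None, float("inf")
    for name, known in database.items():
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(embedding, known)))
        if dist < best_dist:
            best_name, best_dist = name, dist
    if best_dist > threshold:
        return None, best_dist  # nobody in the database is close enough
    return best_name, best_dist


# Toy database: one stored embedding per person
faces = {
    "alice": [0.1, 0.9, 0.0],
    "bob": [0.8, 0.1, 0.1],
}

print(identify([0.12, 0.88, 0.01], faces))
print(identify([5.0, 5.0, 5.0], faces))
```

In a real system the embeddings would come from the trained network, one per image in the database folder.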


5. Emojinator

Emojinator - A simple emoji classifier for humans. (github.com)

This code helps you to recognize and classify different emojis. As of now, we are only supporting hand emojis.

Code Requirements

You can install Conda for Python, which resolves all the dependencies for machine learning.



Emojis are ideograms and smileys used in electronic messages and web pages. Emoji exist in various genres, including facial expressions, common objects, places and types of weather, and animals. They are much like emoticons, but emoji are actual pictures instead of typographics.


  1. Filters to detect hand.
  2. CNN for training the model.

Python Implementation

  1. Network Used- Convolutional Neural Network


  1. First, you have to create a gesture database. For that, run CreateGest.py. Enter the gesture name and you will get 2 frames displayed. Look at the contour frame and adjust your hand to make sure that you capture the features of your hand. Press 'c' for capturing the images. It will take 1200 images of one gesture. Try moving your hand a little within the frame to make sure that your model doesn't overfit at the time of training.
  2. Repeat this for all the gestures you want.
  3. Run CreateCSV.py to convert the images to a CSV file.
  4. If you want to train the model, run TrainEmojinator.py.
  5. Finally, run Emojinator.py for testing your model via webcam.
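To illustrate what converting images to a CSV file can look like, here is a sketch that flattens each image into one row, with the gesture label in the first column. This layout is an assumption on my part, and the function names are hypothetical; the actual format written by CreateCSV.py may differ.

```python
import csv
import io


def image_to_row(label, image):
    """Flatten a 2-D grid of pixel values into [label, p0, p1, ...]."""
    return [label] + [pixel for row in image for pixel in row]


def write_rows(rows, handle):
    """Write the flattened rows to any file-like object as CSV."""
    csv.writer(handle).writerows(rows)


# Toy 2x2 grayscale "image" for gesture number 3
tiny_image = [[0, 255], [128, 64]]
buffer = io.StringIO()
write_rows([image_to_row(3, tiny_image)], buffer)
print(buffer.getvalue().strip())
```

With 1200 captures per gesture, the file would hold 1200 such rows for each label, ready to be loaded for training.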


Akshay Bahadur and Raghav Patnecha.


Final Words

I can only say I’m incredibly impressed by these projects. You can run all of them on your computer, or more easily on Deep Cognition’s platform, which runs online, if you don’t want to install anything.

I want to thank Akshay and his friends for making these great Open Source contributions and for all the others that will come. Try them, run them, and get inspired. This is only a small example of the amazing things DL and CV can do, and it is up to you to take this and turn it into something that can help the world become a better place.

Never give up, we need everyone to be interested in lots of different things. I think we can change the world for the better, improve our lives, the way we work, think and solve problems, and if we channel all the resources we have right now to make these areas of knowledge work together for a greater good, we can make a huge positive impact in the world and our lives.

We need more people interested, more courses, more specializations, more enthusiasm. We need you :)

Thanks for reading this. I hope you found something interesting here :)

If you have questions, just add me on Twitter:

Favio Vázquez (@FavioVaz) | Twitter
The latest Tweets from Favio Vázquez (@FavioVaz). Data Scientist. Physicist and computational engineer. I have a… (twitter.com)

and LinkedIn:

Favio Vázquez — Principal Data Scientist — OXXO | LinkedIn
View Favio Vázquez’s profile on LinkedIn, the world’s largest professional community. Favio has 15 jobs listed on… (linkedin.com)

See you there :)

Bio: Favio Vazquez is a physicist and computer engineer working on Data Science and Computational Cosmology. He has a passion for science, philosophy, programming, and music. Right now he is working on data science, machine learning and big data as the Principal Data Scientist at Oxxo. He is also the creator of Ciencia y Datos, a Data Science publication in Spanish. He loves new challenges, working with a good team and having interesting problems to solve. He is part of the Apache Spark collaboration, helping with MLlib, Core and the Documentation. He loves applying his knowledge and expertise in science, data analysis, visualization, and automatic learning to help the world become a better place.

Original. Reposted with permission.