- Useful Data Science: Feature Hashing - Jan 28, 2016.
Feature engineering plays a major role in solving data science problems. Here, we look at feature hashing, also known as the hashing trick, a method for turning arbitrary features into a sparse binary vector.
- Beyond One-Hot: an exploration of categorical variables - Dec 8, 2015.
Categorical variables are often coded into numbers for machine learning algorithms by assigning an integer to each category, known as ordinal coding. Here, we explore different ways of converting a categorical variable and their effects on the dimensionality of the data.
- Getting started with Python and Apache Flink - Nov 13, 2015.
Apache Flink is built on a distributed streaming dataflow architecture, which helps it crunch high-velocity, high-volume data sets. With version 1.0 it provides a Python API; learn how to write a simple Flink application in Python.