Python Data Science with Pandas vs Spark DataFrame: Key Differences
A post describing the key differences between Pandas and Spark DataFrames, with specifics on common processing operations and code samples.
on Jan 29, 2016 in Apache Spark, Pandas, Python
Deep Learning with Spark and TensorFlow
The integration of TensorFlow with Spark leverages the distributed framework for hyperparameter tuning and model deployment at scale. Both time savings and improved error rates are demonstrated.
on Jan 28, 2016 in Apache Spark, Deep Learning, Distributed Systems, TensorFlow
How to Check Hypotheses with Bootstrap and Apache Spark
Learn how to leverage bootstrap sampling to test hypotheses, and how to implement it in Apache Spark and Scala with a complete code example.
on Jan 28, 2016 in Apache Spark, Bootstrap sampling, Dmitry Petrov, Statistical Analysis
Implementing Your Own k-Nearest Neighbor Algorithm Using Python
A detailed explanation of one of the most used machine learning algorithms, k-Nearest Neighbors, and its implementation from scratch in Python. Enhance your algorithmic understanding with this hands-on coding exercise.
on Jan 27, 2016 in K-nearest neighbors, Python, Python Tutorial
7 Common Data Science Mistakes and How to Avoid Them
The role of a data scientist in business is similar to that of a detective: discovering the unknown. But while venturing on this journey, data scientists tend to fall into pitfalls. Understand how these mistakes are made and how you can avoid them.
on Jan 26, 2016 in Data Science, Khushbu Shah, Mistakes
Learning to Code Neural Networks
Learn how to code a neural network, by taking advantage of someone else's experiences learning how to code a neural network.
on Jan 22, 2016 in Backpropagation, Denny Britz, Neural Networks, Python
Scikit-learn and Python Stack Tutorials: Introduction, Implementing Classifiers
A small collection of introductory scikit-learn and Python stack tutorials for those with an existing understanding of machine learning looking to jump right into using a new set of tools.
on Jan 18, 2016 in IPython, Python, scikit-learn, Tutorials
Hitchhikers Guide to Azure Machine Learning Studio
Learn Azure ML Studio through this brief hands-on tutorial. This step-by-step guide will help you get a quick start and grasp the basics of this predictive modeling tool.
on Jan 15, 2016 in AWS, Azure ML, Decision Trees, edX, Machine Learning, Web services
Attention and Memory in Deep Learning and NLP
An overview of attention mechanisms and memory in deep neural networks and why they work, including some specific applications in natural language processing and beyond.
on Jan 12, 2016 in Deep Learning, Machine Translation, NLP, Recurrent Neural Networks
7 Steps to Understanding Deep Learning
There are many deep learning resources freely available online, but it can be confusing to know where to begin. Go from a vague understanding of deep neural networks to knowledgeable practitioner in 7 steps!
on Jan 11, 2016 in 7 Steps, Caffe, Convolutional Neural Networks, Deep Learning, Matthew Mayo, Recurrent Neural Networks, TensorFlow, Theano
Understanding Rare Events and Anomalies: Why streaks patterns change
We often look back at the past year and the overall history of rare events, and then try to extrapolate the future odds of the same rare event from that record. We illustrate here that rare past events are of no use in understanding the rarity of the same events in the future!
on Jan 8, 2016 in Anomaly Detection, Predictions, S&P 500