Would You Survive the Titanic? A Guide to Machine Learning in Python Part 3
This is the final part of a 3-part introductory series on machine learning in Python, using the Titanic dataset.
Patrick Triest, SocialCops.
Editor's note: This is the third part of a 3-part introductory series on machine learning with Python. Catch up with yesterday's post first if you need to.
Computational Brains - An Introduction To Neural Networks
Neural networks are a rapidly developing paradigm for information processing based loosely on how neurons in the brain process information. A neural network consists of multiple layers of nodes, where each node performs a unit of computation and passes the result on to the next node. Multiple nodes can pass inputs to a single node, and vice versa.
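To make the idea of nodes passing results forward more concrete, here is a minimal sketch in plain Python. It is not the code from this series: the sigmoid activation, the layer sizes, and every weight value are assumptions chosen purely for illustration.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """Compute one layer: each node takes a weighted sum of every input,
    adds its bias, and applies the sigmoid activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]


def network_forward(inputs, layers):
    """Feed the inputs through each layer in turn; the output of one layer
    becomes the input of the next."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# Hypothetical toy network: 3 inputs -> 2 hidden nodes -> 1 output node.
toy_layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.0]),
]
toy_output = network_forward([1.0, 0.0, 2.0], toy_layers)
```

Each inner list of weights belongs to one node, so a layer with two nodes and three inputs holds two lists of three weights each.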
The neural network also contains a set of weights, which can be refined over time as the network learns from sample data. The weights are used to describe and refine the connection strengths between nodes. For instance, in our Titanic data set, node connections transmitting the passenger sex and class will likely be weighted very heavily, since these are important for determining the survival of a passenger.
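As a rough illustration of how weights are refined from sample data, the sketch below trains a single sigmoid node with batch gradient descent. The four rows are made-up toy data, not real Titanic records: the first feature stands in for an informative input and the second for an uninformative one. The function name `train_node`, the learning rate, and the epoch count are all assumptions.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_node(rows, learning_rate=1.0, epochs=1000):
    """Fit one sigmoid node with batch gradient descent on log loss."""
    n_features = len(rows[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in rows:
            prediction = sigmoid(
                sum(w * x for w, x in zip(weights, features)) + bias
            )
            error = prediction - label  # gradient of log loss w.r.t. the pre-activation
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        # Step each weight against its average gradient.
        weights = [
            w - learning_rate * g / len(rows) for w, g in zip(weights, grad_w)
        ]
        bias -= learning_rate * grad_b / len(rows)
    return weights, bias


# Synthetic toy rows: ([informative_feature, uninformative_feature], label).
toy_rows = [
    ([1.0, 0.0], 1),
    ([1.0, 1.0], 1),
    ([0.0, 0.0], 0),
    ([0.0, 1.0], 0),
]
learned_weights, learned_bias = train_node(toy_rows)
```

After training, the weight on the informative feature should end up much larger in magnitude than the weight on the uninformative one, which is the same effect described above for passenger sex and class.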
The major advantage of neural networks over traditional machine learning techniques is their ability to find patterns in unstructured data (such as images or natural language). As such, training a deep neural network on the Titanic dataset is total overkill, but it’s a cool technology to work with, so we’re going to do it anyway.
Above we have written the code to build a deep neural network classifier. The “hidden units” of the classifier represent the neural layers we described earlier, with the corresponding numbers representing the size of each layer.
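To show what a list of hidden-unit sizes implies, the following sketch turns such a list into the shape of each weight matrix and counts the trainable parameters. The specific values used here (8 input features, hidden layers of 20, 40, and 20 nodes, and 2 output classes) are hypothetical and are not taken from the classifier code in this series; `layer_shapes` and `count_parameters` are illustrative helper names, not part of any library.

```python
def layer_shapes(n_inputs, hidden_units, n_classes):
    """Return (inputs, outputs) for every fully connected layer."""
    sizes = [n_inputs] + list(hidden_units) + [n_classes]
    return list(zip(sizes[:-1], sizes[1:]))


def count_parameters(shapes):
    """Each layer has one weight per input/output pair plus one bias per output."""
    return sum(n_in * n_out + n_out for n_in, n_out in shapes)


# Hypothetical configuration, for illustration only.
example_shapes = layer_shapes(n_inputs=8, hidden_units=[20, 40, 20], n_classes=2)
example_total = count_parameters(example_shapes)
```

Under these assumptions, even a small three-hidden-layer network carries well over a thousand weights, which helps explain why it takes longer to train than a decision tree on the same data.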
We can also define our own training model to pass to the TensorFlow estimator function, as seen above. Our defined model is very basic; for more advanced examples of how to work within this syntax, see the skflow documentation here.
Despite the increased power and longer runtime of these neural network models, you will notice that the accuracy is still about the same as what we achieved using more traditional tree-based methods. The main advantage of neural networks, unsupervised learning of unstructured data, does not necessarily lend itself well to our Titanic dataset, so this is not too surprising.
I still, however, think that running the passenger data of a 104-year-old shipwreck through a cutting-edge deep neural network is pretty cool.