Inside the Mind of a Neural Network with Interactive Code in Tensorflow
Understand the inner workings of neural network models as this post covers three related topics: histograms of weights, visualizing the activations of neurons, and interior / integrated gradients.
Interior Gradients / Positive and Negative Attributes
alpha value of 1.0
Now let's use interior gradients, with alpha values of 0.01, 0.01, 0.03, 0.04, 0.1, 0.5, 0.6, 0.7, 0.8 and 1.0 (the last being the same as the original input), to visualize the inner workings of the network.
All of the images above represent the gradient with respect to the input at different alpha values. We can observe that as the alpha value increases, the gradients move closer to 0 (black) due to saturation within the network. The difference in how the gradients change is clearest when we view all of the images in a gif format.
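The idea above can be sketched in a few lines: scale the input by each alpha and evaluate the gradient of the class score at the scaled input. The toy `tanh` "network" below is a hypothetical stand-in for the real model (with TensorFlow you would compute the gradient of the logit with respect to the scaled input); it is chosen because its gradient saturates toward 0 for large inputs, mirroring the effect described above.

```python
import numpy as np

def interior_gradients(grad_fn, x, alphas):
    """Return {alpha: gradient evaluated at alpha * x} for each scaling factor."""
    return {a: grad_fn(a * x) for a in alphas}

# Toy differentiable "score" f(x) = sum(tanh(x)); its gradient is
# 1 - tanh(x)^2, which goes to 0 as the input grows (saturation).
def toy_grad(x):
    return 1.0 - np.tanh(x) ** 2

x = np.full((4, 4), 3.0)            # a "bright" input that saturates tanh
alphas = [0.01, 0.1, 0.5, 1.0]
grads = interior_gradients(toy_grad, x, alphas)

for a in alphas:
    # gradient magnitude shrinks as alpha grows, just like the images above
    print(a, grads[a].mean())
```

At small alpha the input is far from saturation, so the gradients are large; at alpha = 1.0 they are nearly zero, which is why the later images look mostly black.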
However, I was able to make an interesting observation: as the alpha value becomes smaller, my network wasn't able to make correct predictions. As seen below, when the alpha value is small (0.01) the network predicts mostly 3 or 4 (cat/deer).
Red Box → Model's Prediction
But as the alpha value increases, eventually reaching 1.0, we can observe that the accuracy of the model increases. Finally, let's take a look at the positive and negative attributes from the gradient at each alpha value.
Blue Pixel → Positive Attributes Overlaid on the Original Image
Red Pixel → Negative Attributes Overlaid on the Original Image
Again, it is easier to view the change if we create a gif out of the images.
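Splitting an attribution map into the blue (positive) and red (negative) overlays shown above can be sketched as below. This is an assumption about the post's exact recipe: the map is clipped at zero in each direction and normalized by a high percentile (as in the NumPy `clip`/`percentile` references) so a few extreme pixels don't wash out the rest.

```python
import numpy as np

def polarity_overlays(attribution, clip_percentile=99):
    """Split an attribution map into (positive, negative) maps scaled to [0, 1]."""
    pos = np.clip(attribution, 0, None)    # pixels pushing the score up (blue)
    neg = np.clip(-attribution, 0, None)   # pixels pushing the score down (red)
    # Normalize by a high percentile rather than the max so that a handful of
    # extreme pixels don't dominate the visualization.
    scale_p = np.percentile(pos, clip_percentile) + 1e-8
    scale_n = np.percentile(neg, clip_percentile) + 1e-8
    return np.clip(pos / scale_p, 0, 1), np.clip(neg / scale_n, 0, 1)

attr = np.array([[0.5, -0.2],
                 [1.5,  0.0]])             # toy gradient * input map
pos, neg = polarity_overlays(attr)
```

Each returned map can then be dropped into one color channel (blue for `pos`, red for `neg`) and alpha-blended over the original image with Matplotlib.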
Integrated Gradients / Positive and Negative Attributes
Left Image → Integrated Gradients with 3000 steps for the Riemann approximation
Right Image → Blue Color for Positive Attributes and Red Color for Negative Attributes
Finally, when we use Integrated Gradients to visualize the gradients, we observe something like the above. One very interesting thing I found was how the network predicted horse (7), dog (5) and deer (4).
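The core computation behind these images can be sketched as follows: average the gradients along the straight-line path from a baseline (here zeros) to the input, approximated with a Riemann sum over a fixed number of steps, then scale by the input-baseline difference. The `grad_fn` below is a toy stand-in for the model's true gradient; the sanity check uses the method's completeness property (attributions should sum to the score difference between input and baseline).

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline=None, steps=50):
    """Riemann-sum approximation of Integrated Gradients along a linear path."""
    if baseline is None:
        baseline = np.zeros_like(x)
    # Midpoint Riemann sum over the interpolation path from baseline to x
    alphas = (np.arange(steps) + 0.5) / steps
    avg_grad = np.mean(
        [grad_fn(baseline + a * (x - baseline)) for a in alphas], axis=0)
    return (x - baseline) * avg_grad

# Toy score f(x) = sum(tanh(x)); its gradient is 1 - tanh(x)^2.
grad_fn = lambda x: 1.0 - np.tanh(x) ** 2
x = np.array([1.0, 2.0, -1.0])

ig = integrated_gradients(grad_fn, x, steps=3000)
# Completeness check: attributions should sum to f(x) - f(baseline)
print(ig.sum(), np.tanh(x).sum())
```

With 3000 steps (the step count used for the left image above) the approximation error of the Riemann sum is negligible for a smooth score function.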
Let's first take a look at the original images. As seen above, the left and middle images are horses and the right image is a sheep.
When we visualize the Integrated Gradients we can see something interesting. As seen in the middle image, the network thought the head portion of the horse was most important. However, when taking only the head into account, the network thought it was an image of a dog (and it does kind of look like a dog). It is a similar story for the right image: when the network takes into account only the head of the sheep, it predicts a dog.
However, as seen in the leftmost image, when the network takes in the overall shape of the horse (including some portion of a human) it correctly identifies the image as a horse. AMAZING!
Interactive Code
For Google Colab, you need a Google account to view the code. Also, you can't run read-only scripts in Google Colab, so make a copy in your own playground. Finally, I will never ask for permission to access your files on Google Drive, just FYI. Happy coding! For transparency, I have uploaded all of the training logs to my GitHub.
To access the code for this post, please click here; for the training logs, click here.
Final Words
As this area is still growing and being developed, there will be new methods and findings (I hope I can contribute to those findings as well). Finally, if you want to access the code from the original authors of the paper "Axiomatic Attribution for Deep Networks", please click here.
If any errors are found, please email me at jae.duk.seo@gmail.com; if you wish to see the list of all of my writing, please view my website here.
Meanwhile, follow me on Twitter here, and visit my website or my YouTube channel for more content. I also implemented Wide Residual Networks; please click here to view that blog post.
Reference
- Axiom/definition/theorem/proof. (2018). YouTube. Retrieved 14 June 2018, from https://www.youtube.com/watch?v=OeC5WuZbNMI
- axiom — Dictionary Definition. (2018). Vocabulary.com. Retrieved 14 June 2018, from https://www.vocabulary.com/dictionary/axiom
- Axiomatic Attribution for Deep Networks — Mukund Sundararajan, Ankur Taly, Qiqi Yan. (2017). Vimeo. Retrieved 14 June 2018, from https://vimeo.com/238242575
- STL-10 dataset. (2018). Cs.stanford.edu. Retrieved 15 June 2018, from https://cs.stanford.edu/~acoates/stl10/
- Benenson, R. (2018). Classification datasets results. Rodrigob.github.io. Retrieved 15 June 2018, from http://rodrigob.github.io/are_we_there_yet/build/classification_datasets_results.html#53544c2d3130
- How to hide axes and gridlines in Matplotlib (python). (2018). Stack Overflow. Retrieved 16 June 2018, from https://stackoverflow.com/questions/45148704/how-to-hide-axes-and-gridlines-in-matplotlib-python
- VanderPlas, J. (2018). Multiple Subplots | Python Data Science Handbook. Jakevdp.github.io. Retrieved 16 June 2018, from https://jakevdp.github.io/PythonDataScienceHandbook/04.08-multiple-subplots.html
- matplotlib.pyplot.subplot — Matplotlib 2.2.2 documentation. (2018). Matplotlib.org. Retrieved 16 June 2018, from https://matplotlib.org/api/_as_gen/matplotlib.pyplot.subplot.html
- More than 9 subplots in matplotlib. (2018). Stack Overflow. Retrieved 16 June 2018, from https://stackoverflow.com/questions/4158367/more-than-9-subplots-in-matplotlib
- pylab_examples example code: subplots_demo.py — Matplotlib 2.0.0 documentation. (2018). Matplotlib.org. Retrieved 16 June 2018, from https://matplotlib.org/2.0.0/examples/pylab_examples/subplots_demo.html
- matplotlib.pyplot.hist — Matplotlib 2.2.2 documentation. (2018). Matplotlib.org. Retrieved 16 June 2018, from https://matplotlib.org/api/_as_gen/matplotlib.pyplot.hist.html
- Is there a clean way to generate a line histogram chart in Python? (2018). Stack Overflow. Retrieved 16 June 2018, from https://stackoverflow.com/questions/27872723/is-there-a-clean-way-to-generate-a-line-histogram-chart-in-python
- Pyplot Text — Matplotlib 2.2.2 documentation. (2018). Matplotlib.org. Retrieved 16 June 2018, from https://matplotlib.org/gallery/pyplots/pyplot_text.html#sphx-glr-gallery-pyplots-pyplot-text-py
- [ Google / ICLR 2017 / Paper Summary ] Gradients of Counterfactuals. (2018). Towards Data Science. Retrieved 16 June 2018, from https://towardsdatascience.com/google-iclr-2017-paper-summary-gradients-of-counterfactuals-6306510935f2
- numpy.transpose — NumPy v1.14 Manual. (2018). Docs.scipy.org. Retrieved 16 June 2018, from https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.transpose.html
- Git add and commit in one command. (2018). Stack Overflow. Retrieved 16 June 2018, from https://stackoverflow.com/questions/4298960/git-add-and-commit-in-one-command
- How to slice an image into red, green and blue channels with misc.imread. (2018). Stack Overflow. Retrieved 16 June 2018, from https://stackoverflow.com/questions/37431599/how-to-slice-an-image-into-red-green-and-blue-channels-with-misc-imread
- How to install Bash shell command-line tool on Windows 10. (2016). Windows Central. Retrieved 16 June 2018, from https://www.windowscentral.com/how-install-bash-shell-command-line-windows-10
- Google Colab Free GPU Tutorial — Deep Learning Turkey — Medium. (2018). Medium. Retrieved 16 June 2018, from https://medium.com/deep-learning-turkey/google-colab-free-gpu-tutorial-e113627b9f5d
- Installation — imgaug 0.2.5 documentation. (2018). Imgaug.readthedocs.io. Retrieved 16 June 2018, from http://imgaug.readthedocs.io/en/latest/source/installation.html
- numpy.absolute — NumPy v1.14 Manual. (2018). Docs.scipy.org. Retrieved 16 June 2018, from https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.absolute.html
- numpy.clip — NumPy v1.10 Manual. (2018). Docs.scipy.org. Retrieved 16 June 2018, from https://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.clip.html
- numpy.percentile — NumPy v1.14 Manual. (2018). Docs.scipy.org. Retrieved 16 June 2018, from https://docs.scipy.org/doc/numpy/reference/generated/numpy.percentile.html
- CIFAR-10 and CIFAR-100 datasets. (2018). Cs.toronto.edu. Retrieved 16 June 2018, from https://www.cs.toronto.edu/~kriz/cifar.html
- [ ICLR 2015 ] Striving for Simplicity: The All Convolutional Net with Interactive Code [ Manual…. (2018). Towards Data Science. Retrieved 16 June 2018, from https://towardsdatascience.com/iclr-2015-striving-for-simplicity-the-all-convolutional-net-with-interactive-code-manual-b4976e206760
- EN10/CIFAR. (2018). GitHub. Retrieved 16 June 2018, from https://github.com/EN10/CIFAR
- (2018). Arxiv.org. Retrieved 16 June 2018, from https://arxiv.org/pdf/1506.06579.pdf
- Riemann sum. (2018). En.wikipedia.org. Retrieved 16 June 2018, from https://en.wikipedia.org/wiki/Riemann_sum
Bio: Jae Duk Seo is a fourth year computer scientist at Ryerson University.
Original. Reposted with permission.
Related:
- Building Convolutional Neural Network using NumPy from Scratch
- How I Used CNNs and Tensorflow and Lost a Silver Medal in Kaggle Challenge
- Using Tensorflow Object Detection to do Pixel Wise Classification