Top 10 Deep Learning Tips & Tricks
Deep Learning has been at the forefront of data science innovations throughout 2015. Dr. Arno Candel offers help through some valuable tips.
Dr. Arno Candel is Chief Architect at H2O.ai and is considered one of the leading deep learning experts. He has over a decade of experience in high-performance computing and has designed and implemented high-performance machine learning algorithms. Arno was named a 2014 Big Data All-Star by Fortune Magazine.
I heard him speak last month at H2O World 2015 and would love to share his advice with everyone. While most of the tips are generic, some are specific to H2O.
Here are the top 10 deep learning tips and tricks, according to Arno:
1. Understand Model Complexity
- Model size depends on features
- Model size is independent of number of rows or training time
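To make the first tip concrete, here is a minimal sketch that counts the weights and biases of a fully connected network. The function name `count_parameters` and the layer sizes are illustrative assumptions, not part of the talk. Note that the number of training rows never enters the calculation.

```python
def count_parameters(n_inputs, hidden, n_outputs):
    """Count weights and biases in a fully connected feed-forward network.

    n_inputs  -- number of input features (after any encoding of categoricals)
    hidden    -- list of hidden layer sizes, e.g. [200, 200]
    n_outputs -- number of output units
    """
    sizes = [n_inputs] + list(hidden) + [n_outputs]
    total = 0
    for fan_in, fan_out in zip(sizes, sizes[1:]):
        total += fan_in * fan_out  # weight matrix between two layers
        total += fan_out           # one bias per unit in the next layer
    return total


# Same hidden layers, different number of input features (hypothetical sizes).
small = count_parameters(n_inputs=100, hidden=[200, 200], n_outputs=2)
wide = count_parameters(n_inputs=10_000, hidden=[200, 200], n_outputs=2)
print(small, wide)
```

With these assumed sizes, widening the input from 100 to 10,000 features grows the model from about sixty thousand to about two million parameters, whether the dataset has a thousand rows or a billion.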
2. Establish a Baseline on Holdout Data
- Develop a feel for the problem and the holdout performance of the different models
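One simple way to get a first baseline is to score a majority-class predictor on a holdout split. The sketch below uses made-up labels and hypothetical helper names (`holdout_split`, `majority_baseline_accuracy`). It only illustrates the idea and is not a procedure from the talk.

```python
import random
from collections import Counter


def holdout_split(rows, holdout_fraction, seed):
    """Shuffle a copy of rows and return (train, holdout)."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_holdout = int(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]


def majority_baseline_accuracy(train_labels, holdout_labels):
    """Accuracy on the holdout set of always predicting the most common training label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    hits = sum(1 for y in holdout_labels if y == majority)
    return hits / len(holdout_labels)


# Made-up, imbalanced binary labels: 80 negatives and 20 positives.
labels = [0] * 80 + [1] * 20
train_labels, holdout_labels = holdout_split(labels, holdout_fraction=0.2, seed=42)
baseline = majority_baseline_accuracy(train_labels, holdout_labels)
print(f"majority-class holdout accuracy: {baseline:.2f}")
```

Any deep learning model worth keeping should beat this number on the same holdout data.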
3. Inspect Models in Flow (Notebook-style open-source UI for H2O)
- If the model is wrong (wrong architecture, response, parameters, etc.), cancel it
- If the model is taking too long, cancel it and decrease model complexity
- If the model is performing badly, cancel it and increase model complexity
4. Use Early Stopping (On by default for H2O)
- Saves tons of time
- Use Flow to inspect model
- Validation data determines scoring speed and early stopping decision
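The idea behind early stopping can be sketched with a generic patience rule: stop once the validation error has not improved for a given number of scoring rounds. This is an illustration only. The function name, `patience`, and `min_delta` are assumptions here, and H2O's own stopping criterion may differ in its details.

```python
def early_stopping_epoch(validation_errors, patience=3, min_delta=0.0):
    """Return (stopped_epoch, best_epoch, best_error) for a sequence of
    per-epoch validation errors, using a simple patience rule."""
    best_error = float("inf")
    best_epoch = -1
    for epoch, err in enumerate(validation_errors):
        if err < best_error - min_delta:
            # New best model: remember it and keep training.
            best_error, best_epoch = err, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` consecutive epochs: stop here.
            return epoch, best_epoch, best_error
    # Ran out of epochs without triggering the stopping rule.
    return len(validation_errors) - 1, best_epoch, best_error


# Made-up validation errors: improvement stalls after epoch 2.
errors = [0.50, 0.40, 0.35, 0.36, 0.37, 0.38, 0.30]
stopped, best, best_err = early_stopping_epoch(errors, patience=3)
print(stopped, best, best_err)
```

In this made-up run, training stops at epoch 5 and keeps the model from epoch 2. The final epoch is never computed, which is where the time saving comes from.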
6. Use N-fold Cross-Validation
- Estimate your model performance well
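As a rough sketch of what N-fold cross-validation does with the rows, the generator below splits row indices into contiguous folds so that each row is used for validation exactly once. The name `kfold_indices` is hypothetical, and real implementations usually shuffle or stratify the rows first.

```python
def kfold_indices(n_rows, n_folds):
    """Yield (train_idx, valid_idx) pairs for n-fold cross-validation.

    Rows are split into contiguous folds without shuffling; fold sizes
    differ by at most one row.
    """
    indices = list(range(n_rows))
    base, extra = divmod(n_rows, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        valid_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, valid_idx
        start += size


folds = list(kfold_indices(n_rows=10, n_folds=3))
for train_idx, valid_idx in folds:
    print(len(train_idx), valid_idx)
```

A model is trained on each `train_idx` and scored on the matching `valid_idx`. Averaging the N scores gives a more stable performance estimate than a single split.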
7. Use Regularization
- Overfitting is easy, generalization is art
- Parameters to tune: hidden dropout ratios, input dropout ratios, adaptive rate, etc.
- Focus on just finding one of the many good models
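To show what a dropout ratio controls, here is a minimal sketch of "inverted" dropout applied to one layer's activations. This is a common textbook variant chosen for illustration. It is not necessarily how H2O scales activations internally, and the function name `dropout` is an assumption.

```python
import random


def dropout(activations, ratio, rng):
    """Zero each activation with probability `ratio` and rescale the
    survivors by 1 / (1 - ratio) so the expected value is unchanged."""
    if not 0.0 <= ratio < 1.0:
        raise ValueError("ratio must be in [0, 1)")
    keep = 1.0 - ratio
    return [a / keep if rng.random() >= ratio else 0.0 for a in activations]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
hidden = [1.0, 2.0, 3.0, 4.0]
dropped = dropout(hidden, 0.5, rng)    # roughly half the units are zeroed
unchanged = dropout(hidden, 0.0, rng)  # a ratio of 0 disables dropout
print(dropped, unchanged)
```

Higher ratios regularize more strongly, so a model that overfits is a candidate for a larger hidden or input dropout ratio.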
9. Use Checkpointing
- Checkpointing enables fast exploration
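The mechanics of checkpointing can be sketched in a few lines: save the training state after each epoch and resume from it on the next run. The toy `train` function, the JSON file layout, and the fake weight update below are all assumptions for illustration. They do not reflect H2O's checkpoint format or API.

```python
import json
import os
import tempfile


def train(epochs, checkpoint_path):
    """Train up to `epochs` total epochs, resuming from a checkpoint if one exists.

    Returns (state, steps_run), where steps_run counts epochs run in this call.
    """
    state = {"epoch": 0, "weight": 0.0}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)  # continue where the last run stopped
    steps_run = 0
    while state["epoch"] < epochs:
        state["weight"] += 0.1  # stand-in for a real training update
        state["epoch"] += 1
        steps_run += 1
        with open(checkpoint_path, "w") as f:
            json.dump(state, f)  # persist progress after every epoch
    return state, steps_run


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model_checkpoint.json")
    first_state, first_steps = train(3, path)    # initial short run
    second_state, second_steps = train(5, path)  # resumes at epoch 3
    print(first_steps, second_steps, second_state["epoch"])
```

The second call only runs the two missing epochs. That is what makes exploration fast: start with a short run, inspect the model, and extend training only if it looks promising.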
10. Tune Communication on Multi-Node
- Know your "target ratio of communication overhead to computation" value
The original slide deck covers the points above.