- 8 New Tools I Learned as a Data Scientist in 2020 - Jan 14, 2021.
The author shares the data science tools learned while making the move from Docker to Live Deployments.
- 5 Tools for Effortless Data Science - Jan 11, 2021.
The sixth tool is coffee.
- Model Experiments, Tracking and Registration using MLflow on Databricks - Jan 5, 2021.
This post covers how StreamSets can help expedite operations at some of the most crucial stages of the Machine Learning Lifecycle and MLOps, and demonstrates integration with Databricks and MLflow.
- 10 Underrated Python Skills - Oct 21, 2020.
Tips for feature analysis, hyperparameter tuning, data visualization and more.
- Scaling the Wall Between Data Scientist and Data Engineer - Feb 17, 2020.
The educational and research focuses of machine learning tend to highlight the model building, training, testing, and optimization aspects of the data science process. Bringing these models into use requires a suite of engineering feats and organization, a standard for which does not yet exist. Learn more about a framework for operating a collaborative data science and engineering team to deploy machine learning models to end-users.
- Managing Machine Learning Cycles: Five Learnings from comparing Data Science Experimentation/ Collaboration Tools - Jan 29, 2020.
Machine learning projects require handling different versions of data, source code, hyperparameters, and environment configuration. Numerous tools are on the market for managing this variety, and this review features important lessons learned from an ongoing evaluation of the current landscape.
- [eBook] Standardizing the Machine Learning Lifecycle - Mar 15, 2019.
We explore what makes the machine learning lifecycle so challenging compared to regular software, and share the Databricks approach.
- GitHub Python Data Science Spotlight: AutoML, NLP, Visualization, ML Workflows - Aug 8, 2018.
This post covers a wide spectrum of data science projects, all of which are open source and available in GitHub repositories.
- Manage your Machine Learning Lifecycle with MLflow – Part 1 - Jul 5, 2018.
Reproducibility, good management, and experiment tracking are necessary for making it easy to test others' work and analysis. In this first part, we start with simple examples of how to record and query experiments and how to package Machine Learning models so they are reproducible and can be run on any platform using MLflow.