- Is Your Model Overtrained? - Apr 14, 2021.
WeightWatcher is based on theoretical research (done jointly with UC Berkeley) into Why Deep Learning Works, drawing on our Theory of Heavy Tailed Self-Regularization (HT-SR). It uses ideas from Random Matrix Theory (RMT), Statistical Mechanics, and Strongly Correlated Systems.
- How to break a model in 20 days — a tutorial on production model analytics - Mar 29, 2021.
This is an article on how models fail in production, and how to spot it.
- Learning from machine learning mistakes - Mar 19, 2021.
Read this article and discover how to find weak spots of a regression model.
- KDnuggets™ News 21:n10, Mar 10: More Resources for Women in AI, Data Science, and Machine Learning; Speeding up Scikit-Learn Model Training - Mar 10, 2021.
More Resources for Women in AI, Data Science, and Machine Learning; Speeding up Scikit-Learn Model Training; Dask and Pandas: No Such Thing as Too Much Data; 9 Skills You Need to Become a Data Engineer; 8 Women in AI Who Are Striving to Humanize the World
- Evaluating Object Detection Models Using Mean Average Precision - Mar 3, 2021.
In this article we will see how precision and recall are used to calculate the Mean Average Precision (mAP).
- My machine learning model does not learn. What should I do? - Feb 10, 2021.
This article presents 7 hints on how to get out of the quicksand.
- Backcasting: Building an Accurate Forecasting Model for Your Business - Feb 5, 2021.
This article sheds some light on the processes happening under the hood of ML-based solutions, using the example of a business case where future success directly depends on the ability to predict unknown values from the past.
- Vision Transformers: Natural Language Processing (NLP) Increases Efficiency and Model Generality - Feb 2, 2021.
Why do we hear so little about transformer models applied to computer vision tasks? What about attention in computer vision networks?
- MLOps: Model Monitoring 101 - Jan 6, 2021.
Model monitoring using a model metric stack is essential to put a feedback loop from a deployed ML model back to the model building stage so that ML models can constantly improve themselves under different scenarios.
- Model Experiments, Tracking and Registration using MLflow on Databricks - Jan 5, 2021.
This post covers how StreamSets can help expedite operations at some of the most crucial stages of Machine Learning Lifecycle and MLOps, and demonstrates integration with Databricks and MLflow.
- Production Machine Learning Monitoring: Outliers, Drift, Explainers & Statistical Performance - Dec 21, 2020.
A practical deep dive on production monitoring architectures for machine learning at scale using real-time metrics, outlier detectors, drift detectors, metrics servers and explainers.
- Undersampling Will Change the Base Rates of Your Model’s Predictions - Dec 17, 2020.
In classification problems, the proportion of cases in each class largely determines the base rate of the predictions produced by the model. Therefore if you use sampling techniques that change this proportion, there is a good chance you will want to rescale / calibrate your predictions before using them in the wild.
- Pruning Machine Learning Models in TensorFlow - Dec 4, 2020.
Read this overview to learn how to make your models smaller via pruning.
- Deploying Trained Models to Production with TensorFlow Serving - Nov 30, 2020.
TensorFlow provides a way to move a trained model to a production environment for deployment with minimal effort. In this article, we’ll use a pre-trained model, save it, and serve it using TensorFlow Serving.
- Simple Python Package for Comparing, Plotting & Evaluating Regression Models - Nov 25, 2020.
This package aims to help users plot evaluation metric graphs with a single line of code for different widely used regression model metrics, comparing them at a glance. The utility also significantly lowers the barrier for practitioners to evaluate different machine learning algorithms, even in an amateur fashion, by applying them to everyday predictive regression problems.
- How to Future-Proof Your Data Science Project - Nov 18, 2020.
This article outlines 5 critical elements of ML model selection & deployment.
- Building Deep Learning Projects with fastai — From Model Training to Deployment - Nov 4, 2020.
A getting-started guide to developing computer vision applications with fastai.
- Machine Learning Model Deployment - Sep 30, 2020.
Read this article on machine learning model deployment using serverless deployment. Serverless compute abstracts away provisioning, managing servers, and configuring software, simplifying model deployment.
- The Insiders’ Guide to Generative and Discriminative Machine Learning Models - Sep 18, 2020.
In this article, we will look at the difference between generative and discriminative models and how they contrast with one another.
- KDnuggets™ News 20:n34, Sep 9: Top Online Data Science Masters Degrees; Modern Data Science Skills: 8 Categories, Core Skills, and Hot Skills - Sep 9, 2020.
Also: Creating Powerful Animated Visualizations in Tableau; PyCaret 2.1 is here: What's new?; How To Decide What Data Skills To Learn; How to Evaluate the Performance of Your Machine Learning Model
- The NLP Model Forge: Generate Model Code On Demand - Aug 24, 2020.
You've seen their Big Bad NLP Database and The Super Duper NLP Repo. Now Quantum Stat is back with its most ambitious NLP product yet: The NLP Model Forge.
- Facebook Uses Bayesian Optimization to Conduct Better Experiments in Machine Learning Models - Aug 10, 2020.
Research from Facebook proposes a Bayesian optimization method to run A/B tests in machine learning models.
- Wrapping Machine Learning Techniques Within AI-JACK Library in R - Jul 17, 2020.
The article shows an approach to solving the problem of selecting the best technique in machine learning. This can be done in R using just one library called AI-JACK, and the article shows how to use this tool.
- Stop training more models, start deploying them - Jun 30, 2020.
We are hardly living up to the promises of AI in healthcare. It’s not because of our training, it’s because of our deployment.
- A TensorFlow Modeling Pipeline Using TensorFlow Datasets and TensorBoard - Jun 23, 2020.
This article investigates TensorFlow components for building a toolset to make modeling evaluation more efficient. Specifically, TensorFlow Datasets (TFDS) and TensorBoard (TB) can be quite helpful in this task.
- How to make AI/Machine Learning models resilient during COVID-19 crisis - Jun 11, 2020.
COVID-19-driven concept shift has created concern over the continued use of AI/ML to drive business value, following cases of inaccurate outputs and misleading results in a variety of fields. Data Science teams must invest effort in post-model tracking and management, and bring agility to the AI/ML process, to curb problems related to concept shift.
- Build and deploy your first machine learning web app - May 22, 2020.
A beginner’s guide to training and deploying machine learning pipelines in Python using PyCaret.
- Hyperparameter Optimization for Machine Learning Models - May 7, 2020.
Check out this comprehensive guide to model optimization techniques.
- KDnuggets™ News 20:n17, Apr 29: The Super Duper NLP Repo; Free Machine Learning & Data Science Books & Courses for Quarantine - Apr 29, 2020.
Also: Should Data Scientists Model COVID19 and other Biological Events; Learning during a crisis (Data Science 90-day learning challenge); Data Transformation: Standardization vs Normalization; DBSCAN Clustering Algorithm in Machine Learning; Find Your Perfect Fit: A Quick Guide for Job Roles in the Data World
- Announcing PyCaret 1.0.0 - Apr 21, 2020.
An open source low-code machine learning library in Python. PyCaret is an alternate low-code library that can be used to replace hundreds of lines of code with only a few words. This makes experiments exponentially fast and efficient.
- The Double Descent Hypothesis: How Bigger Models and More Data Can Hurt Performance - Apr 20, 2020.
OpenAI research shows a phenomenon that challenges both traditional statistical learning theory and conventional wisdom among machine learning practitioners.
- Want to Build an AI Model for Your Business? Read this - Mar 25, 2020.
The best approach for AI production is similar to what venture capitalists (VCs) do when they evaluate and invest in startups.
- ModelDB 2.0 is here! - Mar 19, 2020.
We are excited to announce that ModelDB 2.0 is now available! We have learned a lot since building ModelDB 1.0, so we decided to rebuild from the ground up.
- Decision Boundary for a Series of Machine Learning Models - Mar 13, 2020.
I train a series of Machine Learning models using the iris dataset, construct synthetic data from the extreme points within the data, and test a number of Machine Learning models in order to draw the decision boundaries from which the models make predictions in a 2D space. This is useful for illustrative purposes and for understanding how different Machine Learning models make predictions.
- KDnuggets™ News 19:n45, Nov 27: Interpretable vs black box models; Advice for New and Junior Data Scientists - Nov 27, 2019.
This week: Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead; Advice for New and Junior Data Scientists; Python Tuples and Tuple Methods; Can Neural Networks Develop Attention? Google Thinks they Can; Three Methods of Data Pre-Processing for Text Classification
- Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead - Nov 20, 2019.
The two main takeaways from this paper: firstly, a sharpening of my understanding of the difference between explainability and interpretability, and why the former may be problematic; and secondly some great pointers to techniques for creating truly interpretable models.
- Automate Hyperparameter Tuning for Your Models - Sep 20, 2019.
When we create our machine learning models, a common task that falls on us is how to tune them. So that brings us to the quintessential question: Can we automate this process?
- Version Control for Data Science: Tracking Machine Learning Models and Datasets - Sep 13, 2019.
I am a Git god, why do I need another version control system for Machine Learning Projects?
- Introducing AI Explainability 360: A New Toolkit to Help You Understand what Machine Learning Models are Doing - Aug 27, 2019.
Recently, AI researchers from IBM open sourced AI Explainability 360, a new toolkit of state-of-the-art algorithms that support the interpretability and explainability of machine learning models.
- 7 Tips for Dealing With Small Data - Jul 29, 2019.
At my workplace, we produce a lot of functional prototypes for our clients. Because of this, I often need to make Small Data go a long way. In this article, I’ll share 7 tips to improve your results when prototyping with small datasets.
- From Data Pre-processing to Optimizing a Regression Model Performance - Jul 19, 2019.
All you need to know about data pre-processing, and how to build and optimize a regression model using Backward Elimination method in Python.
- Data-driven to Model-driven: The Strategic Shift Being Made by Leading Organizations - Jun 17, 2019.
You can have all the data you want, do all the machine learning you want, but if you aren’t running your business on models, you’ll soon be left behind. In this webinar, we will demystify the model-driven business.
- The Machine Learning Puzzle, Explained - Jun 17, 2019.
Lots of moving parts go into creating a machine learning model. Let's take a look at some of these core concepts and see how the machine learning puzzle comes together.
- All Models Are Wrong – What Does It Mean? - Jun 12, 2019.
During your adventures in data science, you may have heard “all models are wrong.” Let’s unpack this famous quote to understand how we can still make models that are useful.
- 7 Steps to Mastering Intermediate Machine Learning with Python — 2019 Edition - Jun 3, 2019.
This is the second part of this new learning path series for mastering machine learning with Python. Check out these 7 steps to help master intermediate machine learning with Python!
- Choosing Between Model Candidates - May 29, 2019.
Models are useful because they allow us to generalize from one situation to another. When we use a model, we’re working under the assumption that there is some underlying pattern we want to measure, but it has some error on top of it.
- Careful! Looking at your model results too much can cause information leakage - May 24, 2019.
We are all aware of the issue of overfitting, which is essentially where the model you build replicates the training data results so perfectly that it is fitted to the training data and does not generalise to better represent the population the data comes from, with catastrophic results when you feed in new data and get very odd results.
- Modeling 101 - May 13, 2019.
In the past couple of decades, innovation in statistics and machine learning has been increasing at a rapid pace and we are now able to do things unimaginable when I began my career.
- Modeling Price with Regularized Linear Model & XGBoost - May 2, 2019.
We are going to implement regularization techniques for linear regression of house pricing data. Our goal in price modeling is to model the pattern and ignore the noise.
- Distributed Artificial Intelligence: A primer on Multi-Agent Systems, Agent-Based Modeling, and Swarm Intelligence - Apr 18, 2019.
Distributed Artificial Intelligence (DAI) is a class of technologies and methods that span from swarm intelligence to multi-agent technologies. It is one of the subsets of AI where simulation has greater importance than point prediction.
- Checklist for Debugging Neural Networks - Mar 22, 2019.
Check out these tangible steps you can take to identify and fix issues with training, generalization, and optimization for machine learning models.
- KDnuggets™ News 19:n05, Jan 30: Your AI skills are worth less than you think; 7 Steps to Mastering Basic Machine Learning - Jan 30, 2019.
Also: Logistic Regression: A Concise Technical Overview; AI is a Big Fat Lie; How To Fine Tune Your Machine Learning Models To Improve Forecasting Accuracy; Airbnb Rental Listings Dataset Mining; Data Science Project Flow for Startups
- A startup: Quantitative Modeler [Remote, US] - Jan 18, 2019.
A Startup is seeking a talented and highly motivated Quantitative Modeler for a unique and exciting opportunity with a small team looking to accurately predict the events (team and player level) of future sports competitions.
- BERT: State of the Art NLP Model, Explained - Dec 26, 2018.
BERT’s key technical innovation is applying the bidirectional training of Transformer, a popular attention model, to language modelling. It has caused a stir in the Machine Learning community by presenting state-of-the-art results in a wide variety of NLP tasks.
- Multi-Class Text Classification Model Comparison and Selection - Nov 1, 2018.
This is what we are going to do today: use everything that we have presented about text classification in the previous articles (and more) and compare the text classification models we trained in order to choose the most accurate one for our problem.
- Will Models Rule the World? Data Science Salon Miami, Nov 6-7 - Oct 19, 2018.
This post is excerpted from the thoughts of Data Science Salon Miami speakers on the future of model-based decision-making.
- How to Solve the ModelOps Challenge - Oct 18, 2018.
A recent study shows that while 85% believe data science will allow their companies to obtain or sustain a competitive advantage, only 5% are using data science extensively. Join this webinar, Nov 14, to find out why.
- Society of Machines: The Complex Interaction of Agents - Oct 4, 2018.
In this new series we’ll focus on collective interaction of two or more machines. This interaction of machines can be among each other or with the environment.
- Machine Learning: How to Build a Model From Scratch - Sep 20, 2018.
Register now for the upcoming webinar, Building a Machine Learning Fraud Model with Momentum Travel, on Sep 27 @ 10 AM PT.
- How to Make Your Machine Learning Models Robust to Outliers - Aug 28, 2018.
In this blog, we’ll try to understand the different interpretations of this “distant” notion. We will also look into the outlier detection and treatment techniques while seeing their impact on different types of machine learning models.
- Leveraging Agent-based Models (ABM) and Digital Twins to Prevent Injuries - Aug 22, 2018.
Both athletes and machines deal with intertwined complex systems (where the interactions of one complex system can have a ripple effect on others) that can have a significant impact on their operational effectiveness.
- Building Reliable Machine Learning Models with Cross-validation - Aug 9, 2018.
Cross-validation is frequently used to train, measure and finally select a machine learning model for a given dataset because it helps assess how the results of a model will generalize to an independent data set in practice.
- Modelling Time Series Processes using GARCH - May 25, 2018.
To go into the turbulent seas of volatile data and analyze it in a time changing setting, ARCH models were developed.
- Model Risk Management with Automated Machine Learning, Mar 29 Webinar - Mar 9, 2018.
Model Risk Management has recently become a very hot topic in regulatory and compliance-rich industries. Join DataRobot on Mar 29, 2018 for a webinar titled "Model Risk Management with Automated Machine Learning."
- A Framework for Approaching Textual Data Science Tasks - Nov 22, 2017.
Although NLP and text mining are not the same thing, they are closely related, deal with the same raw data type, and have some crossover in their uses. Let's discuss the steps in approaching these types of tasks.
- Interpreting Machine Learning Models: An Overview - Nov 7, 2017.
This post summarizes the contents of a recent O'Reilly article outlining a number of methods for interpreting machine learning models, beyond the usual go-to measures.
- Train your Deep Learning Faster: FreezeOut - Aug 3, 2017.
We explain another novel method for much faster training of Deep Learning models by freezing the intermediate layers, and show that it has little or no effect on accuracy.
- Models: From the Lab to the Factory - Apr 27, 2017.
In this post, we’ll go over techniques to avoid these scenarios through the process of model management and deployment.
- Must-Know: Why it may be better to have fewer predictors in Machine Learning models? - Apr 4, 2017.
There are a few reasons why it might be a better idea to have fewer predictor variables rather than having many of them. Read on to find out more.
- What is Structural Equation Modeling? - Mar 27, 2017.
Structural Equation Modeling (SEM) is an extremely broad and flexible framework for data analysis, perhaps better thought of as a family of related methods rather than as a single technique. What is its relevance to Marketing Research?
- Statistical Modeling: A Primer - Mar 21, 2017.
It's critical to understand that statistical models are simplified representations of reality and they're all wrong but some of them are useful. So why do we use statistical models?
- Learn about modeling methods at PAW Chicago, Jun 19-22 - Feb 27, 2017.
Predictive Analytics World Business is coming to Chicago Jun 19-22, featuring advanced predictive modeling methods. Register by March 10 for super early bird rates.
- Smart Data Platform – The Future of Big Data Technology - Dec 2, 2016.
Data processing and analytical modelling are major bottlenecks in today’s big data world, due to the need for human intelligence to decide relationships between data, required data engineering tasks, analytical models, and their parameters. This article discusses how a Smart Data Platform can help solve such problems.
- Approaching (Almost) Any Machine Learning Problem - Aug 18, 2016.
If you're looking for an overview of how to approach (almost) any machine learning problem, this is a good place to start. Read on as a Kaggle competition veteran shares his pipelines and approach to problem-solving.
- New Standard Methodology for Analytical Models - Aug 3, 2015.
Traditional methods for analytical modelling, like CRISP-DM, have several shortcomings. Here we describe these friction points in CRISP-DM and introduce a new approach, the Standard Methodology for Analytical Models, which overcomes them.
- Automatic Statistician and the Profoundly Desired Automation for Data Science - Feb 17, 2015.
The Automatic Statistician project by Univ. of Cambridge and MIT is pushing ahead the frontiers of automation for the selection and evaluation of machine learning models. In general, what does automation mean to Data Science?
- Boeing: Advanced Technologist – Modeling - Nov 25, 2014.
Modeling, using formal languages for system specification, requirements formalization, application of formal methods to safety analysis, vehicle health management, and test-case generation.