- Comparing Linear and Logistic Regression - Nov 29, 2022.
Discussion on an entry-level data science interview question.
- An Introduction to SMOTE - Nov 29, 2022.
Improve the model performance by balancing the dataset using the synthetic minority oversampling technique.
- SHAP: Explain Any Machine Learning Model in Python - Nov 21, 2022.
A Comprehensive Guide to SHAP and Shapley Values
- Picking Examples to Understand Machine Learning Model - Nov 21, 2022.
Understanding ML by combining explainability and sample picking.
- How LinkedIn Uses Machine Learning To Rank Your Feed - Nov 14, 2022.
In this post, you will learn to clarify business problems & constraints, understand problem statements, select evaluation metrics, overcome technical challenges, and design high-level systems.
- Machine Learning from Scratch: Decision Trees - Nov 11, 2022.
A simple explanation and implementation of the ID3 decision tree algorithm in Python.
- Getting Started with PyCaret - Nov 10, 2022.
An open-source, low-code machine learning library for training models and deploying them in production.
- Confusion Matrix, Precision, and Recall Explained - Nov 9, 2022.
Learn these key machine learning performance metrics to ace data science interviews.
- The Most Comprehensive List of Kaggle Solutions and Ideas - Nov 8, 2022.
Learn from top-performing teams in the competition to get better at understanding machine learning techniques.
- 15 More Free Machine Learning and Deep Learning Books - Nov 7, 2022.
Check out this second list of 15 FREE ebooks for learning machine learning and deep learning.
- Simple and Fast Data Streaming for Machine Learning Projects - Nov 3, 2022.
Learn about DagsHub's cutting-edge Direct Data Access for simpler and faster data loading and model training.
- Random Forest vs Decision Tree: Key Differences - Nov 1, 2022.
Check out this reasoned comparison of 2 critical machine learning algorithms to help you make a better-informed decision.
- The Gap Between Deep Learning and Human Cognitive Abilities - Oct 31, 2022.
How do we bridge this gap between deep learning and human cognitive ability?
- 15 Free Machine Learning and Deep Learning Books - Oct 31, 2022.
Check out this list of 15 FREE ebooks for learning machine learning and deep learning.
- Machine Learning on the Edge - Oct 27, 2022.
Edge ML involves putting ML models on consumer devices where they can independently run inferences without an internet connection, in real-time, and at no cost.
- The First ML Value Chain Landscape - Oct 24, 2022.
TheSequence recently released the first-ever ML Value Chain Landscape, shaped by data scientists, a new landscape intended to address the entire ML value chain.
- Ensemble Learning with Examples - Oct 24, 2022.
Learn various ensemble algorithms that improve the robustness and performance of machine learning applications and help you build more generalized, stable models.
- Why TinyML Cases Are Becoming Popular? - Oct 20, 2022.
This article will provide an overview of what TinyML is, its use cases, and why it is becoming more popular.
- Frameworks for Approaching the Machine Learning Process - Oct 19, 2022.
This post is a summary of 2 distinct frameworks for approaching machine learning tasks, followed by a distilled third. Do they differ considerably (or at all) from each other, or from other such processes available?
- Working With Sparse Features In Machine Learning Models - Oct 17, 2022.
Sparse features can cause problems like overfitting and suboptimal results in learning models, and understanding why this happens is crucial when developing models. Multiple methods, including dimensionality reduction, are available to overcome issues due to sparse features.
- Implementing Adaboost in Scikit-learn - Oct 17, 2022.
AdaBoost is short for Adaptive Boosting: the weights are re-assigned to each instance, with higher weights given to instances that were not correctly classified, so the algorithm ‘adapts’.
- Mathematics for Machine Learning: The Free eBook - Oct 14, 2022.
Check out this free ebook covering the fundamentals of mathematics for machine learning, as well as its companion website of exercises and Jupyter notebooks.
- Classification Metrics Walkthrough: Logistic Regression with Accuracy, Precision, Recall, and ROC - Oct 13, 2022.
In this article, I will be going through 4 common classification metrics: Accuracy, Precision, Recall, and ROC in relation to Logistic Regression.
- 5 Free Courses to Master Linear Algebra - Oct 13, 2022.
Linear Algebra is an important subfield of mathematics and forms a core foundation of machine learning algorithms. The post shares five free courses to master the concepts of linear algebra.
- The Complete Free PyTorch Course for Deep Learning - Oct 12, 2022.
Do you want to learn PyTorch for machine learning and deep learning? Check out this 24-hour-long video course with accompanying notes and courseware for free. Did I mention it's free?
- A Day in the Life of a Machine Learning Engineer - Oct 10, 2022.
What does a day in the life of a machine learning engineer look like?
- Hyperparameter Tuning Using Grid Search and Random Search in Python - Oct 5, 2022.
A comprehensive guide on optimizing model hyperparameters with Scikit-Learn.
- Machine Learning for Everybody! - Oct 4, 2022.
Who is machine learning for? Everybody!
- Which Metric Should I Use? Accuracy vs. AUC - Oct 4, 2022.
Depending on the problem you’re trying to solve, one metric may be more insightful than another.
- 7 Steps to Mastering Machine Learning with Python in 2022 - Sep 30, 2022.
Are you trying to teach yourself machine learning from scratch, but aren’t sure where to start? I will attempt to condense all the resources I’ve used over the years into 7 steps that you can follow to teach yourself machine learning.
- Master Transformers with This Free Stanford Course! - Sep 30, 2022.
If you want a deep dive on transformers, this Stanford course has made its courseware freely available, including lecture videos, readings, assignments, and more.
- Beyond Pipelines: Graphs as Scikit-Learn Metaestimators - Sep 29, 2022.
Create manageable and scalable machine learning workflows with skdag.
- Top 5 Machine Learning Practices Recommended by Experts - Sep 28, 2022.
This article is intended to help beginners improve their model structure by listing the best practices recommended by machine learning experts.
- How to Correctly Select a Sample From a Huge Dataset in Machine Learning - Sep 27, 2022.
We explain how choosing a small, representative dataset from a large population can improve model training reliability.
- More Performance Evaluation Metrics for Classification Problems You Should Know - Sep 20, 2022.
When building and optimizing your classification model, measuring how accurately it predicts your expected outcome is crucial. However, this metric alone is never the entire story, as it can still offer misleading results. That's where these additional performance evaluations come into play to help tease out more meaning from your model.
- AWS AI & ML Scholarship Program Overview - Sep 19, 2022.
This scholarship program aims to help people who are underserved and were underrepresented during high school and college learn the foundations and concepts of machine learning and build careers in AI and ML.
- 7 Machine Learning Portfolio Projects to Boost the Resume - Sep 19, 2022.
Work on machine learning and deep learning portfolio projects to learn new skills and improve your chance of getting hired.
- 5 Concepts You Should Know About Gradient Descent and Cost Function - Sep 16, 2022.
Why is Gradient Descent so important in Machine Learning? Learn more about this iterative optimization algorithm and how it is used to minimize a loss function.
- An Intuitive Explanation of Collaborative Filtering - Sep 15, 2022.
The post introduces one of the most popular recommendation algorithms, i.e., collaborative filtering. It focuses on building an intuitive understanding of the algorithm illustrated with the help of an example.
- Everything You’ve Ever Wanted to Know About Machine Learning - Sep 9, 2022.
Putting the fun in fundamentals! A collection of short videos to amuse beginners and experts alike.
- Machine Learning Algorithms – What, Why, and How? - Sep 7, 2022.
This post explains why and when you need machine learning and concludes by listing the key considerations for choosing the correct machine learning algorithm.
- Choosing the Right Clustering Algorithm for Your Dataset - Sep 7, 2022.
Applying a clustering algorithm is much easier than selecting the best one. Each type offers pros and cons that must be considered if you’re striving for a tidy cluster structure.
- Visualizing Your Confusion Matrix in Scikit-learn - Sep 6, 2022.
Defining model evaluation metrics is crucial in ensuring that the model performs precisely for the purpose it is built. Confusion Matrix is one of the most popular and effective tools to evaluate the performance of the trained ML model. In this post, you will learn how to visualize the confusion matrix and interpret its output.
- Decision Tree Pruning: The Hows and Whys - Sep 2, 2022.
Decision trees are a machine learning algorithm that is susceptible to overfitting. One of the techniques you can use to reduce overfitting in decision trees is pruning.
- The Difference Between Training and Testing Data in Machine Learning - Aug 31, 2022.
When building a predictive model, the quality of the results depends on the data you use. To get good results, you need to understand the difference between training and testing data in machine learning.
- Machine Learning Metadata Store - Aug 31, 2022.
In this article, we will learn about metadata stores, the need for them, their components, and metadata store management.
- A Complete Guide To Decision Tree Software - Aug 26, 2022.
Decision tree models are used to classify information into meaningful sequential results. Find out everything else you need to know here.
- How to Package and Distribute Machine Learning Models with MLFlow - Aug 25, 2022.
MLFlow is a tool to manage the end-to-end lifecycle of a machine learning model. This article covers the installation and configuration of an MLFlow service, with examples of how to generate and share projects with MLFlow.
- 7 Techniques to Handle Imbalanced Data - Aug 24, 2022.
This blog post introduces seven techniques that are commonly applied in domains like intrusion detection or real-time bidding, because the datasets are often extremely imbalanced.
- The Bias-Variance Trade-off - Aug 24, 2022.
Understanding how these prediction errors work and how they can be used will help you build models that are not only accurate and perform well - but also avoid overfitting and underfitting.
- Support Vector Machines: An Intuitive Approach - Aug 23, 2022.
This post focuses on building an intuition of the Support Vector Machine algorithm in a classification context and an in-depth understanding of how that graphical intuition can be mathematically represented in the form of a loss function. We will also discuss kernel tricks and a more useful variant of SVM with a soft margin.
- Tuning Random Forest Hyperparameters - Aug 22, 2022.
Hyperparameter tuning is important for algorithms. It improves the overall performance of a machine learning model. Hyperparameters are set before the learning process and sit outside of the model.
- Implementing DBSCAN in Python - Aug 17, 2022.
Density-based clustering algorithm explained with scikit-learn code example.
- How to Avoid Overfitting - Aug 17, 2022.
Overfitting is when a statistical model fits its training data too closely, which leads to the model failing to predict future observations accurately.
- Machine Learning Over Encrypted Data - Aug 16, 2022.
This blog outlines a solution to the Kaggle Titanic challenge that employs Privacy-Preserving Machine Learning (PPML) using the Concrete-ML open-source toolkit.
- What Does ETL Have to Do with Machine Learning? - Aug 15, 2022.
ETL sits at the base, the foundation, of the process of producing effective machine learning algorithms. Let’s go through why ETL is important to machine learning.
- Data Transformation: Standardization vs Normalization - Aug 12, 2022.
Increasing accuracy in your models is often obtained through the first steps of data transformations. This guide explains the difference between the key feature scaling methods of standardization and normalization, and demonstrates when and how to apply each approach.
- Tuning XGBoost Hyperparameters - Aug 11, 2022.
Hyperparameter tuning is about finding a set of optimal hyperparameter values which maximizes the model's performance, minimizes loss, and produces better outputs.
- The Difference Between L1 and L2 Regularization - Aug 10, 2022.
Two types of regularized regression models are discussed here: Ridge Regression (L2 Regularization), and Lasso Regression (L1 Regularization)
- 6 Ways Businesses Can Benefit From Machine Learning - Aug 9, 2022.
Machine learning is gaining popularity rapidly in the business world. Discover the ways that your business can benefit from machine learning.
- How to Deal with Categorical Data for Machine Learning - Aug 4, 2022.
Check out this guide to implementing different types of encoding for categorical data, including a cheat sheet on when to use what type.
- What are the Assumptions of XGBoost? - Aug 4, 2022.
In this article, you will learn: how boosting relates to XGBoost; the features of XGBoost; how it reduces the loss function value and overfitting.
- Decision Trees vs Random Forests, Explained - Aug 2, 2022.
A simple, non-math heavy explanation of two popular tree-based machine learning models.
- How ML Model Explainability Accelerates the AI Adoption Journey for Financial Services - Jul 29, 2022.
Explainability and good model governance reduce risk and create the framework for ethical and transparent AI in financial services that eliminates bias.
- K-nearest Neighbors in Scikit-learn - Jul 28, 2022.
Learn about the k-nearest neighbours algorithm, one of the most prominent workhorse machine learning algorithms there is, and how to implement it using Scikit-learn in Python.
- Is Domain Knowledge Important for Machine Learning? - Jul 27, 2022.
If you incorporate domain knowledge into your architecture and your model, it can make it a lot easier to explain the results, both to yourself and to an outside viewer. Every bit of domain knowledge can serve as a stepping stone through the black box of a machine learning model.
- Detecting Data Drift for Ensuring Production ML Model Quality Using Eurybia - Jul 26, 2022.
This article will focus on a step-by-step data drift study using Eurybia, an open-source Python library.
- Does the Random Forest Algorithm Need Normalization? - Jul 25, 2022.
Normalization is a good technique to use when your data needs to be scaled and your chosen machine learning algorithm makes no assumptions about the distribution of your data.
- Using Scikit-learn’s Imputer - Jul 25, 2022.
Learn about Scikit-learn’s SimpleImputer, IterativeImputer, KNNImputer, and machine learning pipelines.
- Practical Deep Learning from fast.ai is Back! - Jul 25, 2022.
Looking for a great course to go from machine learning zero to hero quickly? fast.ai has released the latest version of Practical Deep Learning For Coders. And it won't cost you a thing.
- The Difficulty of Estimating the Carbon Footprint of Machine Learning - Jul 22, 2022.
Is machine learning killing the planet? Probably not, but let's make sure it doesn't.
- When Would Ensemble Techniques be a Good Choice? - Jul 18, 2022.
When would ensemble techniques be a good choice? When you want to improve the performance of machine learning models - it’s that simple.
- How Does Logistic Regression Work? - Jul 15, 2022.
Logistic regression is a machine learning classification algorithm that is used to predict the probability of certain classes based on some independent variables.
- Machine Learning Algorithms Explained in Less Than 1 Minute Each - Jul 13, 2022.
Learn about some of the most well known machine learning algorithms in less than a minute each.
- Why Use k-fold Cross Validation? - Jul 11, 2022.
Generalizing things is easy for us humans; however, it can be challenging for machine learning models. This is where cross-validation comes into the picture.
- Boosting Machine Learning Algorithms: An Overview - Jul 8, 2022.
The combination of several machine learning algorithms is referred to as ensemble learning. There are several ensemble learning techniques. In this article, we will focus on boosting.
- Ten Key Lessons of Implementing Recommendation Systems in Business - Jul 7, 2022.
We have long been working on improving the user experience in UGC products with machine learning. Following this article's advice will help you avoid a lot of mistakes when creating a recommendation system and build a really good product.
- Linear Machine Learning Algorithms: An Overview - Jul 1, 2022.
In this article, we’ll discuss several linear algorithms and their concepts.
- Primary Supervised Learning Algorithms Used in Machine Learning - Jun 17, 2022.
In this tutorial, we are going to list some of the most common algorithms that are used in supervised learning along with a practical tutorial on such algorithms.
- Deep Learning Key Terms, Explained - Jun 13, 2022.
Gain a beginner's perspective on artificial neural networks and deep learning with this set of 14 straight-to-the-point related key concept definitions.
- A Structured Approach To Building a Machine Learning Model - Jun 10, 2022.
This article gives you a glimpse of how to approach a machine learning project with a clear outline of an easy-to-implement 5-step process.
- How is Data Mining Different from Machine Learning? - Jun 8, 2022.
How about we take a closer look at data mining and machine learning so we know how to catch their different ends?
- Genetic Algorithm Key Terms, Explained - Jun 6, 2022.
This article presents simple definitions for 12 genetic algorithm key terms, in order to help better introduce the concepts to newcomers.
- How Activation Functions Work in Deep Learning - Jun 3, 2022.
Check out this article for a better understanding of activation functions.
- A Beginner’s Guide to Q Learning - Jun 3, 2022.
In this article, learn the basics of Q-learning, a model-free reinforcement learning algorithm.
- How to Become a Machine Learning Engineer - May 30, 2022.
A machine learning engineer is a programmer proficient in building and designing software to automate predictive models. They have a deeper focus on computer science, compared to data scientists.
- Weak Supervision Modeling, Explained - May 27, 2022.
This article dives into weak supervision modeling and how to truly understand the label model.
- Operationalizing Machine Learning from PoC to Production - May 20, 2022.
Most companies haven’t seen ROI from machine learning since the benefit is only realized when the models are in production. Here’s how to make sure your ML project works.
- A Comprehensive Survey on Trustworthy Graph Neural Networks: Privacy, Robustness, Fairness, and Explainability - May 20, 2022.
We give a taxonomy of the trustworthy GNNs in privacy, robustness, fairness, and explainability. For each aspect, we categorize existing works into various categories, give general frameworks in each category, and more.
- HuggingFace Has Launched a Free Deep Reinforcement Learning Course - May 17, 2022.
Hugging Face has released a free course on Deep RL. It is self-paced and shares a lot of pointers on theory, tutorials, and hands-on guides.
- Popular Machine Learning Algorithms - May 16, 2022.
This guide will help aspiring data scientists and machine learning engineers gain better knowledge and experience. I will list different types of machine learning algorithms, which can be used with both Python and R.
- Reinforcement Learning for Newbies - May 16, 2022.
A simple guide to reinforcement learning for a complete beginner. The blog includes definitions with examples, real-life applications, key concepts, and various types of learning resources.
- Centroid Initialization Methods for k-means Clustering - May 13, 2022.
This article is the first in a series of articles looking at the different aspects of k-means clustering, beginning with a discussion on centroid initialization.
- The “Hello World” of Tensorflow - May 13, 2022.
In this article, we will build a beginner-friendly machine learning model using TensorFlow.
- Deep Learning For Compliance Checks: What’s New? - May 12, 2022.
By implementing the different NLP techniques into the production processes, compliance departments can maintain detailed checks and keep up with regulator demands.
- 5 Free Hosting Platform For Machine Learning Applications - May 12, 2022.
Learn about free and easy-to-deploy hosting platforms for your machine learning projects.
- Machine Learning’s Sweet Spot: Pure Approaches in NLP and Document Analysis - May 10, 2022.
While it is true that Machine Learning today isn’t ready for prime time in many business cases that revolve around Document Analysis, there are indeed scenarios where a pure ML approach can be considered.
- Machine Learning Key Terms, Explained - May 9, 2022.
Read this overview of 12 important machine learning concepts, presented in a no frills, straightforward definition style.
- Everything You Need to Know About Tensors - May 6, 2022.
In this article, we will cover the basics of tensors.
- Image Classification with Convolutional Neural Networks (CNNs) - May 4, 2022.
In this article, we’ll look at what Convolutional Neural Networks are and how they work.
- Top 10 Machine Learning Demos: Hugging Face Spaces Edition - May 2, 2022.
Hugging Face Spaces lets you interact with machine learning models, and we will explore the best applications to get some inspiration.
- MLOps: The Best Practices and How To Apply Them - Apr 28, 2022.
Here are some of the best practices for implementing MLOps successfully.
- A Simple Guide to Machine Learning Visualisations - Apr 26, 2022.
Create simple, effective machine learning plots with Yellowbrick
- Optimizing Genes with a Genetic Algorithm - Apr 22, 2022.
In the simplest terms genetic algorithms simulate a population where each individual is a possible “solution” and let survival of the fittest do its thing.
- A Community for Synthetic Data is Here and This is Why We Need It - Apr 22, 2022.
The first open-source platform for synthetic data is here to help educate the broader machine learning and computer vision communities on the emerging technology.
- Machine Learning Books You Need To Read In 2022 - Apr 21, 2022.
I have a list of machine learning books you need to read in 2022, with picks for beginners, intermediate readers, experts, and everybody.
- Deploy a Machine Learning Web App with Heroku - Apr 18, 2022.
In this article, you will learn to deploy a fully functional ML web application in under 3 minutes.
- Nearest Neighbors for Classification - Apr 12, 2022.
Learn about the K-Nearest Neighbors machine learning algorithm for classification.
- Naïve Bayes Algorithm: Everything You Need to Know - Apr 8, 2022.
Naïve Bayes is a probabilistic machine learning algorithm based on the Bayes Theorem, used in a wide variety of classification tasks. In this article, we will understand the Naïve Bayes algorithm and all essential concepts so that there is no room for doubts in understanding.
- 4 Factors to Identify Machine Learning Solvable Problems - Apr 6, 2022.
The near future holds incredible possibility for machine learning to solve real-world problems. But we need to be able to determine which problems are solvable by ML and which are not.
- Logistic Regression for Classification - Apr 4, 2022.
Deep dive into Logistic Regression with practical examples.
- DBSCAN Clustering Algorithm in Machine Learning - Apr 4, 2022.
An introduction to the DBSCAN algorithm and its implementation in Python.
- What is an MLOps Engineer? - Apr 1, 2022.
And why you should consider becoming one.
- Machine Learning Pipeline Optimization with TPOT - Mar 31, 2022.
Let's revisit the automated machine learning project TPOT, and get back up to speed on using open source AutoML tools on our way to building a fully-automated prediction pipeline.
- Loss Functions: An Explainer - Mar 31, 2022.
A loss function measures how wrong the model is in terms of its ability to estimate the relationship between x and y. Find out about several common loss functions here.
- Time Series Forecasting with Ploomber, Arima, Python, and Slurm - Mar 29, 2022.
In this blog you will see how the authors took a raw .ipynb notebook that does time series forecasting with Arima, modularized it into a Ploomber pipeline, and ran parallel jobs on Slurm.
- MLOps Is a Mess But That’s to be Expected - Mar 25, 2022.
In this post, I want to focus the discussion on the state of machine learning operations (MLOps) today: where we are and where we are going.
- WTF is a Tensor?!? - Mar 24, 2022.
A tensor is a container which can house data in N dimensions, along with its linear operations, though there is nuance in what tensors technically are and what we refer to as tensors in practice.
- A New Way of Managing Deep Learning Datasets - Mar 23, 2022.
Create, version-control, query, and visualize image, audio, and video datasets using Hub 2.0 by Activeloop.
- Risk Management Framework for AI/ML Models - Mar 23, 2022.
How sound risk management acts as a catalyst to building successful AI/ML models.
- DIY Automated Machine Learning with Streamlit - Mar 22, 2022.
In this article, we will create an automated machine learning web app you can actually use.
- Linear vs Logistic Regression: A Succinct Explanation - Mar 21, 2022.
Linear Regression and Logistic Regression are two well-used Machine Learning Algorithms that both branch off from Supervised Learning. Linear Regression is used to solve Regression problems whereas Logistic Regression is used to solve Classification problems. Read more here.
- From Google Colab to a Ploomber Pipeline: ML at Scale with GPUs - Mar 17, 2022.
In this short blog, we’ll review the process of taking a POC data science pipeline (ML/Deep learning/NLP) that was developed on Google Colab and transforming it into a pipeline that can run in parallel at scale and works with Git so the team can collaborate on it.
- Machine Learning Algorithms for Classification - Mar 14, 2022.
In this article, we will be going through the algorithms that can be used for classification tasks.
- The Significance of Data Quality in Making a Successful Machine Learning Model - Mar 10, 2022.
Good quality data becomes imperative and a basic building block of an ML pipeline. The ML model can only be as good as its training data.
- How To Use Synthetic Data To Overcome Data Shortages For Machine Learning Model Training - Mar 9, 2022.
It takes time and considerable resources to collect, document, and clean data before it can be used. But there is a way to address this challenge – by using synthetic data.
- Building a Tractable, Feature Engineering Pipeline for Multivariate Time Series - Mar 8, 2022.
A time series feature engineering pipeline requires different transformations, such as imputation and window aggregation, which follow a sequence of stages. This article demonstrates the building of a pipeline to derive multivariate time series features such that the features can then be easily tracked and validated.
- Build a Machine Learning Web App in 5 Minutes - Mar 7, 2022.
In this article, you will learn to export your models and use them outside a Jupyter Notebook environment. You will build a simple web application that is able to feed user input into a machine learning model, and display an output prediction to the user.
- 3 Reasons Why You Should Use Linear Regression Models Instead of Neural Networks - Mar 4, 2022.
While there may always seem to be something new, cool, and shiny in the field of AI/ML, classic statistical methods that leverage machine learning techniques remain powerful and practical for solving many real-world business problems.
- What is Adversarial Machine Learning? - Mar 3, 2022.
In the cybersecurity sector, adversarial machine learning attempts to deceive models by creating unique deceptive inputs that confuse the model and cause it to malfunction.
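
To make the idea of a deceptive input concrete, here is a minimal sketch that is not taken from the linked article. It assumes a toy logistic model with hypothetical, hand-picked weights, and it nudges each feature against the sign of its weight, a sign-based perturbation in the spirit of the fast gradient sign method. All names and numbers below are illustrative assumptions.

```python
import math

def predict_proba(weights, bias, x):
    """Probability of the positive class under a simple logistic model."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)

def perturb(weights, x, epsilon):
    """Move each feature by epsilon against the sign of its weight.

    For a linear score this is the change that lowers the positive-class
    score the most when no feature may move by more than epsilon.
    """
    return [xi - epsilon * sign(w) for w, xi in zip(weights, x)]

# Hypothetical toy model and input, chosen only for illustration.
weights = [2.0, -1.0, 0.5]
bias = -0.5
x = [1.0, 0.5, 1.0]

x_adv = perturb(weights, x, epsilon=0.5)
p_clean = predict_proba(weights, bias, x)    # above 0.5: predicted positive
p_adv = predict_proba(weights, bias, x_adv)  # below 0.5: prediction flips

print(f"clean input {x} -> p = {p_clean:.3f}")
print(f"perturbed input {x_adv} -> p = {p_adv:.3f}")
```

With these assumed numbers, no feature moves by more than 0.5, yet the predicted class changes. Real attacks target far larger models, so this sketch only shows the mechanism.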