# Algorithms (206)

**Difference between distributed learning versus federated learning algorithms**- Nov 19, 2021.

Want to know the difference between distributed and federated learning? Read this article to find out.

**Machine Learning Model Development and Model Operations: Principles and Practices**- Oct 27, 2021.

Managing ML models and delivering high-performing models is as important as the initial build of the model with the right dataset. Model retraining, model versioning, model deployment, and model monitoring form the basis of machine learning operations (MLOps), which helps data science teams deliver high-performing models.

**How our Obsession with Algorithms Broke Computer Vision: And how Synthetic Computer Vision can fix it**- Oct 15, 2021.

Deep Learning radically improved Machine Learning as a whole. The Data-Centric revolution is about to do the same. In this post, we’ll take a look at the pitfalls of mainstream Computer Vision (CV) and discuss why Synthetic Computer Vision (SCV) is the future.

**WHT: A Simpler Version of the fast Fourier Transform (FFT) you should know**- Jul 21, 2021.

The fast Walsh Hadamard transform is a simple and useful algorithm for machine learning that was popular in the 1960s and early 1970s. This useful approach should be more widely appreciated and applied for its efficiency.

**Ethics, Fairness, and Bias in AI**- Jun 30, 2021.

As more AI-enhanced applications seep into our daily lives and expand their reach to larger swaths of populations around the world, we must clearly understand the vulnerabilities trained machine learning models can exhibit based on the data used during development. Such issues can negatively impact select groups of people, so addressing the ethical decisions made by AI, possibly unknowingly, is important to the long-term fairness and success of this new technology.

**DeepMind Wants to Reimagine One of the Most Important Algorithms in Machine Learning**- May 14, 2021.

In one of the most important papers this year, DeepMind proposed a multi-agent structure to redefine PCA.

**Ensemble Methods Explained in Plain English: Bagging**- May 10, 2021.

Understand the intuition behind bagging with examples in Python.

**XGBoost Explained: DIY XGBoost Library in Less Than 200 Lines of Python**- May 3, 2021.

Understand how XGBoost works with a simple 200-line implementation of gradient boosting for decision trees.

**KDnuggets™ News 21:n16, Apr 28: Data Science Books You Should Start Reading in 2021; Top 10 Must-Know Machine Learning Algorithms for Data Scientists**- Apr 28, 2021.

Data science is not about data – applying Dijkstra principle to data science; Data Science Books You Should Start Reading in 2021; How to ace A/B Testing Data Science Interviews; Top 10 Must-Know Machine Learning Algorithms for Data Scientists – Part 1; Production-Ready Machine Learning NLP API with FastAPI and spaCy

**Top 10 Must-Know Machine Learning Algorithms for Data Scientists – Part 1**- Apr 22, 2021.

New to data science? Interested in the must-know machine learning algorithms in the field? Check out the first part of our list and introductory descriptions of the top 10 algorithms for data scientists to know.

**Beautiful decision tree visualizations with dtreeviz**- Mar 8, 2021.

Improve on the old way of plotting decision trees and never go back!

**Machine Learning – it’s all about assumptions**- Feb 11, 2021.

Just as with most things in life, assumptions can directly lead to success or failure. Similarly in machine learning, appreciating the assumed logic behind machine learning techniques will guide you toward applying the best tool for the data.

**My machine learning model does not learn. What should I do?**- Feb 10, 2021.

This article presents 7 hints on how to get out of the quicksand.

**K-Means 8x faster, 27x lower error than Scikit-learn in 25 lines**- Jan 15, 2021.

K-means clustering is a powerful algorithm for similarity searches, and Facebook AI Research's faiss library is turning out to be a speed champion. With only a handful of lines of code shared in this demonstration, faiss outperforms the implementation in scikit-learn in speed and accuracy.

**KDnuggets™ News 21:n01, Jan 6: All machine learning algorithms you should know in 2021; Monte Carlo integration in Python; MuZero – the most important ML system ever created?**- Jan 6, 2021.

The first issue in 2021 brings you a great blog about Monte Carlo Integration - in Python; An overview of main Machine Learning algorithms you need to know in 2021; SQL vs NoSQL: 7 Key Takeaways; Generating Beautiful Neural Network Visualizations - how to; MuZero - may be the most important Machine Learning system ever created; and much more!

**All Machine Learning Algorithms You Should Know in 2021**- Jan 4, 2021.

Many machine learning algorithms exist that range from simple to complex in their approach, and together provide a powerful library of tools for analyzing and predicting patterns from data. If you are learning for the first time or reviewing techniques, then these intuitive explanations of the most popular machine learning models will help you kick off the new year with confidence.

**Key Data Science Algorithms Explained: From k-means to k-medoids clustering**- Dec 29, 2020.

As a core method in the Data Scientist's toolbox, k-means clustering is valuable but can be limited based on the structure of the data. Can expanded methods like PAM (partitioning around medoids), CLARA, and CLARANS provide better solutions, and what is the future of these algorithms?

**XGBoost: What it is, and when to use it**- Dec 23, 2020.

XGBoost is a tree-based ensemble machine learning algorithm, implemented as a scalable system for tree boosting. Read more for an overview of the parameters that make it work, and when you would use the algorithm.

**Navigate the road to Responsible AI**- Dec 18, 2020.

Deploying AI ethically and responsibly will involve cross-functional team collaboration, new tools and processes, and proper support from key stakeholders.

**Implementing the AdaBoost Algorithm From Scratch**- Dec 10, 2020.

The AdaBoost technique uses decision trees with a depth of one as its base model, so it is a forest of stumps rather than full trees. AdaBoost works by putting more weight on difficult-to-classify instances and less on those already handled well. The algorithm was developed to solve both classification and regression problems. Learn to build the algorithm from scratch here.

**How to Know if a Neural Network is Right for Your Machine Learning Initiative**- Nov 26, 2020.

It is important to remember that there must be a business reason for even considering neural nets and it should not be because the C-Suite is feeling a bad case of FOMO.

**Know-How to Learn Machine Learning Algorithms Effectively**- Nov 23, 2020.

The takeaway from the story is that machine learning goes well beyond simple fit and predict methods. The author shares their approach to actually learning these algorithms beyond the surface.

**Cellular Automata in Stream Learning**- Nov 20, 2020.

In this post, we will start by presenting cellular automata (CA) as pattern recognition methods for stream learning. Finally, we will briefly mention two recent CA-based solutions for stream learning. Both are highly interpretable, as their cellular structure directly represents the mapping between the feature space and the labels to be predicted.

**How to Acquire the Most Wanted Data Science Skills**- Nov 13, 2020.

We recently surveyed KDnuggets readers to determine the "most wanted" data science skills. Since they seem to be those most in demand from practitioners, here is a collection of resources for getting started with this learning.

**Doing the impossible? Machine learning with less than one example**- Nov 9, 2020.

Machine learning algorithms are notorious for needing data, a lot of data; the more data the better. But much research has gone into developing new methods that need fewer examples to train a model, such as "few-shot" or "one-shot" learning, which require only a handful of examples or as few as one for effective learning. Now, this lower bound on training examples is being taken to the next extreme.

**Exploring the Significance of Machine Learning for Algorithmic Trading with Stefan Jansen**- Oct 28, 2020.

The immense expansion of digital data has increased the demand for proficiency in trading strategies that use machine learning (ML). Learn more from author Stefan Jansen, and get his latest book on the subject from Packt Publishing.

**How to Explain Key Machine Learning Algorithms at an Interview**- Oct 19, 2020.

While preparing for interviews in Data Science, it is essential to clearly understand a range of machine learning models -- with a concise explanation for each at the ready. Here, we summarize various machine learning models by highlighting the main points to help you communicate complex models.

**Exploring The Brute Force K-Nearest Neighbors Algorithm**- Oct 12, 2020.

This article discusses a simple approach to increasing the accuracy of k-nearest neighbors models in a particular subset of cases.

**Algorithms of Social Manipulation**- Oct 9, 2020.

As we all continuously interact with each other and our favorite businesses through apps and websites, the level at which we are being tracked and monitored is significant. While the technologies behind these capabilities provide us value, the tech companies can also influence our decisions on where to click, spend our money, and much more.

**The List of Top 10 Lists in Data Science**- Aug 14, 2020.

The list of Top 10 lists that Data Scientists -- from enthusiasts to those who want to jump start a career -- must know to smoothly navigate a path through this field.

**Data Mining and Machine Learning: Fundamental Concepts and Algorithms: The Free eBook**- Jul 21, 2020.

The second edition of Data Mining and Machine Learning: Fundamental Concepts and Algorithms is available to read freely online, and includes a new part on regression with chapters on linear regression, logistic regression, neural networks, deep learning and regression assessment.

**Time Complexity: How to measure the efficiency of algorithms**- Jun 24, 2020.

When we consider the complexity of an algorithm, we shouldn’t really care about the exact number of operations that are performed; instead, we should care about how the number of operations relates to the problem size.

**Understanding Machine Learning: The Free eBook**- Jun 15, 2020.

Time to get back to basics. This week we have a look at a book on foundational machine learning concepts, Understanding Machine Learning: From Theory to Algorithms.

**KDnuggets™ News 20:n21, May 27: The Best NLP with Deep Learning Course is Free; Your First Machine Learning Web App**- May 27, 2020.

Also: Python For Everybody: The Free eBook; Complex logic at breakneck speed: Try Julia for data science; An easy guide to choose the right Machine Learning algorithm; Dataset Splitting Best Practices in Python; Appropriately Handling Missing Values for Statistical Modelling and Prediction

**Python For Everybody: The Free eBook**- May 25, 2020.

Get back to fundamentals with this free eBook, Python For Everybody, approaching the learning of programming from a data analysis perspective.

**Visualizing Decision Trees with Python (Scikit-learn, Graphviz, Matplotlib)**- Apr 15, 2020.

Learn how to visualize decision trees using matplotlib and Graphviz.

**Introduction to the K-nearest Neighbour Algorithm Using Examples**- Apr 1, 2020.

Read this concise summary of KNN, a supervised pattern classification algorithm that finds which class a new input belongs to by calculating its distance to the existing data points and choosing the k nearest neighbours.

**Deep Learning Breakthrough: a sub-linear deep learning algorithm that does not need a GPU?**- Mar 26, 2020.

Deep Learning sits at the forefront of many important advances underway in machine learning. With backpropagation being a primary training method, its computational inefficiencies require sophisticated hardware, such as GPUs. Learn about this recent algorithmic breakthrough, with improvements to the backpropagation calculations on a CPU that outperform large neural network training with a GPU.

**Making sense of ensemble learning techniques**- Mar 26, 2020.

This article breaks down ensemble learning and how it can be used for problem solving.

**A Top Machine Learning Algorithm Explained: Support Vector Machines (SVM)**- Mar 18, 2020.

Support Vector Machines (SVMs) are powerful for solving regression and classification problems. You should have this approach in your machine learning arsenal, and this article provides all the mathematics you need to know -- it's not as hard as you might think.

**KDnuggets™ News 20:n08, Feb 26: Gartner 2020 Magic Quadrant for Data Science & Machine Learning Platforms; Will AutoML Replace Data Scientists?**- Feb 26, 2020.

This week in KDnuggets: The Death of Data Scientists - will AutoML replace them?; Leaders, Changes, and Trends in Gartner 2020 Magic Quadrant for Data Science and Machine Learning Platforms; Hand labeling is the past. The future is #NoLabel AI; The Forgotten Algorithm; Getting Started with R Programming; and much, much more.

**How to Get Started With Algorithmic Finance**- Jan 23, 2020.

Algorithmic finance has been around for decades as a money-making tool, and it's not magic. Learn about some practical strategies along with an introduction to code you can use to get started.

**Random Forest® — A Powerful Ensemble Learning Algorithm**- Jan 22, 2020.

The article explains the Random Forest algorithm and how to build and optimize a Random Forest classifier.

**Microsoft Introduces Project Petridish to Find the Best Neural Network for Your Problem**- Jan 20, 2020.

The new algorithm takes a novel approach to neural architecture search.

**Handling Trees in Data Science Algorithmic Interview**- Jan 16, 2020.

This post is about fast-tracking the study and explanation of tree concepts for data scientists, so that you breeze through the next time you are asked about them in an interview.

**Classify A Rare Event Using 5 Machine Learning Algorithms**- Jan 15, 2020.

Which algorithm works best for unbalanced data? Are there any tradeoffs?

**5 Ways to Apply Ethics to AI**- Dec 19, 2019.

Here are six more lessons based on real life examples that I think we should all remember as people working in machine learning, whether you’re a researcher, engineer, or a decision-maker.

**Dusting Under the Bed: Machine Learners’ Responsibility for the Future of Our Society**- Dec 13, 2019.

The Machine Learning community must shape the world so that AI is built and implemented with a focus on the entire outcome for our society, and not just optimized for accuracy and/or profit.

**KDnuggets™ News 19:n36, Sep 25: The Hidden Risk of AI and Big Data; The 5 Sampling Algorithms every Data Scientist needs to know**- Sep 25, 2019.

Learn about the unexpected risk of AI applied to Big Data; Study 5 Sampling Algorithms every Data Scientist needs to know; Read how one data scientist copes with his boring days of deploying machine learning; 5 beginner-friendly steps to learn ML with Python; and more.

**The 5 Sampling Algorithms every Data Scientist need to know**- Sep 18, 2019.

Algorithms are at the core of data science and sampling is a critical technique that can make or break a project. Learn more about the most common sampling techniques used, so you can select the best approach while working with your data.

**There is No Free Lunch in Data Science**- Sep 12, 2019.

There is no such thing as a free lunch in life or data science. Here, we'll explore some science philosophy and discuss the No Free Lunch theorems to find out what they mean for the field of data science.

**A Friendly Introduction to Support Vector Machines**- Sep 12, 2019.

This article explains the Support Vector Machines (SVM) algorithm in an easy way.

**The 5 Graph Algorithms That Data Scientists Should Know**- Sep 10, 2019.

In this post, I am going to be talking about some of the most important graph algorithms you should know and how to implement them using Python.

**Top KDnuggets tweets, Aug 21-27: Algorithms Notes for Professionals – Free Book**- Aug 28, 2019.

Algorithms Notes for Professionals - Free Book; 10 simple Linux tips which save 50% of my time in the command line; Why so many #DataScientists are leaving their jobs; Order Matters: Alibaba Transformer-based Recommender System

**How to count Big Data: Probabilistic data structures and algorithms**- Aug 26, 2019.

Learn how probabilistic data structures and algorithms can be used for cardinality estimation in Big Data streams.

**Automate Stacking In Python: How to Boost Your Performance While Saving Time**- Aug 21, 2019.

Utilizing stacking (stacked generalizations) is a very hot topic when it comes to pushing your machine learning algorithm to new heights. For instance, most if not all winning Kaggle submissions nowadays make use of some form of stacking or a variation of it.

**Coding Random Forests® in 100 lines of code***- Aug 7, 2019.

There are dozens of machine learning algorithms out there. It is impossible to learn all their mechanics; however, many algorithms sprout from the most established algorithms, e.g. ordinary least squares, gradient boosting, support vector machines, tree-based algorithms and neural networks.

**An Overview of Outlier Detection Methods from PyOD – Part 1**- Jun 27, 2019.

PyOD is an outlier detection package developed with a comprehensive API to support multiple techniques. This post will showcase Part 1 of an overview of techniques that can be used to analyze anomalies in data.

**10 Gradient Descent Optimisation Algorithms + Cheat Sheet**- Jun 26, 2019.

Gradient descent is an optimization algorithm used for minimizing the cost function in various ML algorithms. Here are some common gradient descent optimisation algorithms used in the popular deep learning frameworks such as TensorFlow and Keras.

**The Machine Learning Puzzle, Explained**- Jun 17, 2019.

Lots of moving parts go into creating a machine learning model. Let's take a look at some of these core concepts and see how the machine learning puzzle comes together.

**Think Like an Amateur, Do As an Expert: Lessons from a Career in Computer Vision**- May 17, 2019.

Dr. Takeo Kanade shared his life lessons from an illustrious 50-year career in Computer Vision at last year's Embedded Vision Summit. You have a chance to attend the 2019 Embedded Vision Summit, from May 20-23, in the Santa Clara Convention Center, Santa Clara CA.

**Naive Bayes: A Baseline Model for Machine Learning Classification Performance**- May 7, 2019.

We can use Pandas to work through Bayes' theorem and scikit-learn to implement the Naive Bayes algorithm. We take a step-by-step approach to understanding Bayes and implementing the different options in scikit-learn.

**KDnuggets™ News 19:n17, May 1: The most desired skill in data science; Seeking KDnuggets Editors, work remotely**- May 1, 2019.

This week, find out about the most desired skill in data science, learn which projects to include in your portfolio, identify a single strategy for pulling data from a Pandas DataFrame (once and for all), read the results of our Top Data Science and Machine Learning Methods poll, and much more.

**Top Data Science and Machine Learning Methods Used in 2018, 2019**- Apr 29, 2019.

Once again, the most used methods are Regression, Clustering, Visualization, Decision Trees/Rules, and Random Forests. The greatest relative increases this year are overwhelmingly Deep Learning techniques, while SVD, SVMs and Association Rules show the greatest decline.

**How Machines Make Sense of Big Data: an Introduction to Clustering Algorithms**- Apr 16, 2019.

We outline three different clustering algorithms - k-means clustering, hierarchical clustering and Graph Community Detection - providing an explanation on when to use each, how they work and a worked example.

**Which Data Science / Machine Learning methods and algorithms did you use in 2018/2019 for a real-world application?**- Apr 9, 2019.

Which Data Science / Machine Learning methods and algorithms did you use in 2018/2019 for a real-world application? Take part in the latest KDnuggets survey and have your say.

**Artificial Neural Networks Optimization using Genetic Algorithm with Python**- Mar 18, 2019.

This tutorial explains the usage of the genetic algorithm for optimizing the network weights of an Artificial Neural Network for improved performance.

**Designing Ethical Algorithms**- Mar 8, 2019.

Ethical algorithm design is becoming a hot topic as machine learning becomes more widespread. But how do you make an algorithm ethical? Here are 5 suggestions to consider.

**The Algorithms Aren’t Biased, We Are**- Jan 29, 2019.

We explain the concept of bias and how it can appear in your projects, share some illustrative examples, and translate the latest academic research on “algorithmic bias.”

**Data Science and Ethics – Why Companies Need a new CEO (Chief Ethics Officer)**- Jan 21, 2019.

We explain why data science companies need to have a Chief Ethics Officer and discuss their importance in tackling algorithm bias.

**A Guide to Decision Trees for Machine Learning and Data Science**- Dec 24, 2018.

What makes decision trees special in the realm of ML models is really their clarity of information representation. The “knowledge” learned by a decision tree through training is directly formulated into a hierarchical structure.

**10 More Must-See Free Courses for Machine Learning and Data Science**- Dec 20, 2018.

Have a look at this follow-up collection of free machine learning and data science courses to give you some winter study ideas.

**Machine learning — Is the emperor wearing clothes?**- Oct 12, 2018.

We take a look at the core concepts of Machine Learning, including the data, algorithm and optimization needed to get you started, with links to additional resources to help enhance your knowledge.

**KDnuggets™ News 18:n38, Oct 10: Concise Explanation of Learning Algorithms; Why I Call Myself a Data Scientist; Linear Regression in the Wild**- Oct 10, 2018.

This week, KDnuggets brings you a discussion of learning algorithms with a hat tip to Tom Mitchell, discusses why you might call yourself a data scientist, explores machine learning in the wild, checks out some top trends in deep learning, shows you how to learn data science if you are low on finances, and puts forth one person's opinion on the top 8 Python machine learning libraries to help get the job done.

**A Concise Explanation of Learning Algorithms with the Mitchell Paradigm**- Oct 5, 2018.

A single quote from Tom Mitchell can shed light on both the abstract concept and concrete implementations of machine learning algorithms.

**Linear Regression in the Wild**- Oct 3, 2018.

We take a look at how to use linear regression when the dependent variables have measurement errors.

**What If the Data Tells You to Be Racist? When Algorithms Explicitly Penalize**- Sep 26, 2018.

Without the right precautions, machine learning (the technology that drives risk-assessment in law enforcement, as well as hiring and loan decisions) explicitly penalizes underprivileged groups.

**KDnuggets™ News 18:n36, Sep 26: Machine Learning Algorithms From Scratch; Deep Learning Framework Popularity; Data Capture, the Deep Learning Way**- Sep 26, 2018.

Also: SQL Case Study: Helping a Startup CEO Manage His Data; Building a Machine Learning Model through Trial and Error; The Whys and Hows of Web Scraping; Unfolding Naive Bayes From Scratch; "Auto-What?" - A Taxonomy of Automated Machine Learning

**Selecting the Best Machine Learning Algorithm for Your Regression Problem**- Aug 1, 2018.

This post should then serve as a great aid in selecting the best ML algorithm for your regression problem!

**Weapons of Math Destruction, Ethical Matrix, Nate Silver and more Highlights from the Data Science Leaders Summit**- Jul 31, 2018.

Domino Data Lab hosted its first ever Data Science Leaders Summit at the lovely Yerba Buena Center for the Arts in San Francisco on May 30-31, 2018. Cathy O'Neil, Nate Silver, Cassie Kozyrkov and Eric Colson were some of the speakers at this event.

**Genetic Algorithm Implementation in Python**- Jul 24, 2018.

This tutorial will implement the genetic algorithm optimization technique in Python based on a simple example in which we are trying to maximize the output of an equation.

**Clustering Using K-means Algorithm**- Jul 18, 2018.

This article explains the K-means algorithm in an easy way. I’d like to start with an example to understand the objective of this powerful technique in machine learning before getting into the algorithm, which is quite simple.

**Deep Learning and Challenges of Scale Webinar**- Jul 9, 2018.

Join Nvidia for an on-demand webinar to learn how to tackle the challenges of scaling and building complex deep learning systems.

**KDnuggets™ News 18:n16, Apr 18: Key Algorithms and Statistical Models; Don’t learn Machine Learning in 24 hours; Data Scientist among the best US Jobs in 2018**- Apr 18, 2018.

Also: Top 10 Technology Trends of 2018; 12 Useful Things to Know About Machine Learning; Robust Word2Vec Models with Gensim & Applying Word2Vec Features for Machine Learning Tasks; Understanding What is Behind Sentiment Analysis - Part 1; Getting Started with PyTorch

**Key Algorithms and Statistical Models for Aspiring Data Scientists**- Apr 16, 2018.

This article provides a summary of key algorithms and statistical techniques commonly used in industry, along with a short resource related to these techniques.

**Ten Machine Learning Algorithms You Should Know to Become a Data Scientist**- Apr 11, 2018.

It's important for data scientists to have a broad range of knowledge, keeping themselves updated with the latest trends. With that being said, we take a look at the top 10 machine learning algorithms every data scientist should know.

**Top 20 Deep Learning Papers, 2018 Edition**- Apr 3, 2018.

Deep Learning is constantly evolving at a fast pace. New techniques, tools and implementations are changing the field of Machine Learning and bringing excellent results.

**Multiscale Methods and Machine Learning**- Mar 19, 2018.

We highlight recent developments in machine learning and Deep Learning related to multiscale methods, which analyze data at a variety of scales to capture a wider range of relevant features. We give a general overview of multiscale methods, examine recent successes, and compare with similar approaches.

**KDnuggets™ News 18:n09, Feb 28: Gartner 2018 MQ for Data Science/ML – Gainers and Losers; Comparative Analysis of Top 6 BI/Data Viz Tools**- Feb 28, 2018.

A Comparative Analysis of Top 6 BI and Data Visualization Tools; A Tour of The Top 10 Algorithms for Machine Learning Newbies; A Guide to Hiring Data Scientists.

**5 Things You Need To Know About Data Science**- Feb 19, 2018.

Here are 5 useful things to know about Data Science, including its relationship to BI, Data Mining, Predictive Analytics, and Machine Learning; Data Scientist job prospects; where to learn Data Science; and which algorithms/methods are used by Data Scientists

**Logistic Regression: A Concise Technical Overview**- Feb 16, 2018.

Interested in learning the concepts behind Logistic Regression (LogR)? Looking for a concise introduction to LogR? This article is for you. Includes a Python implementation and links to an R script as well.

**KDnuggets™ News 18:n07, Feb 14: 5 Machine Learning Projects You Should Not Overlook; Intro to Python Ensembles**- Feb 14, 2018.

5 Machine Learning Projects You Should Not Overlook; Introduction to Python Ensembles; Which Machine Learning Algorithm be used in year 2118?; Fast.ai Lesson 1 on Google Colab (Free GPU)

**A Basic Recipe for Machine Learning**- Feb 13, 2018.

One of the gems that I felt needed to be written down from Ng's deep learning courses is his general recipe for approaching a deep learning algorithm/model.

**Which Machine Learning Algorithm be used in year 2118?**- Feb 9, 2018.

So what were the answers popping into your head? Random forest, SVM, K-means, KNN or even Deep Learning? No, for the answer, we turn to the Lindy Effect.

**Top KDnuggets tweets, Jan 24-30: Top 10 Algorithms for Machine Learning Newbies; Want to Become a Data Scientist? Try Feynman Technique**- Jan 31, 2018.

Also: Chronological List of AI Books To Read - from Goedel, Escher, Bach ... ; Aspiring Data Scientists! Start to learn Statistics with these 6 books.

**Topological Data Analysis for Data Professionals: Beyond Ayasdi**- Jan 16, 2018.

We review recent developments and tools in topological data analysis, including applications of persistent homology to psychometrics and a recent extension of piecewise regression, called Morse-Smale regression.

**Quantum Machine Learning: An Overview**- Jan 5, 2018.

Quantum Machine Learning (Quantum ML) is the interdisciplinary area combining Quantum Physics and Machine Learning (ML). It is a symbiotic association: leveraging the power of Quantum Computing to produce quantum versions of ML algorithms, and applying classical ML algorithms to analyze quantum systems. Read this article for an introduction to Quantum ML.

**How to Improve Machine Learning Algorithms? Lessons from Andrew Ng, part 2**- Dec 21, 2017.

The second chapter of ML lessons from Ng’s experience. This one will only be talking about Human Level Performance & Avoidable Bias.

**Accelerating Algorithms: Considerations in Design, Algorithm Choice and Implementation**- Dec 18, 2017.

If you are trying to make your algorithms run faster, you may want to consider reviewing some important points on design and implementation.

**Top Data Science and Machine Learning Methods Used in 2017**- Dec 11, 2017.

The most used methods are Regression, Clustering, Visualization, Decision Trees/Rules, and Random Forests; Deep Learning is used by only 20% of respondents; we also analyze which methods are most "industrial" and most "academic".

**New Poll: Which Data Science / Machine Learning methods and tools you used?**- Nov 20, 2017.

Please vote in the new KDnuggets poll, which examines the methods and tools used for a real-world application or project.

**The 10 Statistical Techniques Data Scientists Need to Master**- Nov 15, 2017.

The author presents 10 statistical techniques which a data scientist needs to master. Build up your toolbox of data science tools by having a look at this great overview post.

**Machine Learning Algorithms: Which One to Choose for Your Problem**- Nov 14, 2017.

This article will try to explain basic concepts and give some intuition for using different kinds of machine learning algorithms in different tasks. At the end of the article, you’ll find a structured overview of the main features of the described algorithms.

**Density Based Spatial Clustering of Applications with Noise (DBSCAN)**- Oct 26, 2017.

DBSCAN clustering can identify outliers, observations which don’t belong to any cluster. Since DBSCAN clustering identifies the number of clusters as well, it is very useful for unsupervised learning when we don’t know how many clusters there could be in the data.

**Top 10 Machine Learning with R Videos**- Oct 24, 2017.

A complete video guide to Machine Learning in R! This great compilation of tutorials and lectures is an amazing recipe to start developing your own Machine Learning projects.

**Top 10 Machine Learning Algorithms for Beginners**- Oct 20, 2017.

A beginner's introduction to the Top 10 Machine Learning (ML) algorithms, complete with figures and examples for easy understanding.

**Random Forests®, Explained**- Oct 17, 2017.

Random Forest, one of the most popular and powerful ensemble method used today in Machine Learning. This post is an introduction to such algorithm and provides a brief overview of its inner workings.**5 overriding factors for the successful implementation of AI**- Oct 6, 2017.

Today AI is everywhere, from virtual assistants scheduling meetings to facial recognition software and increasingly autonomous cars. We review 5 main factors for successful AI implementation.**KDnuggets™ News 17:n38, Oct 4: What Blockchains Mean to Big Data; Keras Deep Learning Cheat Sheet; Machine Learning in Finance**- Oct 4, 2017.

Also: XGBoost, a Top Machine Learning Method on Kaggle, Explained; How to win Kaggle competition based on NLP task, if you are not an NLP expert; Fundamental Breakthrough in 2 Decade Old Algorithm Redefines Big Data Benchmarks**XGBoost, a Top Machine Learning Method on Kaggle, Explained**- Oct 3, 2017.

Looking to boost your machine learning competition score? Here’s a brief summary and introduction to a powerful and popular tool among Kagglers, XGBoost.**Understanding Machine Learning Algorithms**- Oct 3, 2017.

Machine learning algorithms aren’t difficult to grasp if you understand the basic concepts. Here, a SAS data scientist describes the foundations for some of today’s popular algorithms.**Fundamental Breakthrough in 2 Decade Old Algorithm Redefines Big Data Benchmarks**- Sep 28, 2017.

Read on to find out how the two-decade-old minwise hashing computational barrier has been overcome with a significantly more efficient alternative.**K-Nearest Neighbors – the Laziest Machine Learning Technique**- Sep 12, 2017.

K-Nearest Neighbors (K-NN) is one of the simplest machine learning algorithms. When a new situation occurs, it scans through all past experiences and looks up the k closest experiences. Those experiences (or: data points) are what we call the k nearest neighbors.**Search Millions of Documents for Thousands of Keywords in a Flash**- Sep 1, 2017.

We present a Python library called FlashText that can search or replace keywords / synonyms in documents in O(n) – linear time.**Support Vector Machine (SVM) Tutorial: Learning SVMs From Examples**- Aug 28, 2017.

In this post, we will try to gain a high-level understanding of how SVMs work. I’ll focus on developing intuition rather than rigor. What that essentially means is we will skip as much of the math as possible and develop a strong intuition of the working principle.**How To Write Better SQL Queries: The Definitive Guide – Part 2**- Aug 24, 2017.

Most forget that SQL isn’t just about writing queries, which is just the first step down the road. Ensuring that queries are performant or that they fit the context that you’re working in is a whole other thing. This SQL tutorial will provide you with a small peek at some steps that you can go through to evaluate your query.**Recommendation System Algorithms: An Overview**- Aug 22, 2017.

This post presents an overview of the main existing recommendation system algorithms, in order for data scientists to choose the best one according to a business’s limitations and requirements.**The Machine Learning Abstracts: Support Vector Machines**- Aug 10, 2017.

While earlier entries in this series covered elementary classification algorithms, another (more advanced) machine learning algorithm which can be used for classification is Support Vector Machines (SVM).**Machine Learning Algorithms: A Concise Technical Overview – Part 1**- Aug 4, 2017.

These short and to-the-point tutorials may provide the assistance you are looking for. Each of these posts concisely covers a single, specific machine learning concept.**The Machine Learning Abstracts: Decision Trees**- Aug 3, 2017.

Decision trees are a classic machine learning technique. The basic intuition behind a decision tree is to map out all possible decision paths in the form of a tree.**The Machine Learning Abstracts: Classification**- Jul 27, 2017.

Classification is the process of categorizing or “classifying” some items into a predefined set of categories or “classes”. It is exactly the same even when a machine does so. Let’s dive a little deeper.**Design by Evolution: How to evolve your neural network with AutoML**- Jul 20, 2017.

The gist (tl;dr): Time to evolve! I’m gonna give a basic example (in PyTorch) of using evolutionary algorithms to tune the hyper-parameters of a DNN.**The Machine Learning Algorithms Used in Self-Driving Cars**- Jun 19, 2017.

Machine Learning applications include evaluation of driver condition or driving scenario classification through data fusion from different external and internal sensors. We examine different algorithms used for self-driving cars.**Which Machine Learning Algorithm Should I Use?**- Jun 1, 2017.

A typical question asked by a beginner, when facing a wide variety of machine learning algorithms, is “which algorithm should I use?” The answer to the question varies depending on many factors, including the size, quality, and nature of data, the available computational time, and more.**Top KDnuggets tweets, May 10-16: Which Machine Learning algorithm should I use? #cheatsheet**- May 17, 2017.

Also: HDFS vs. HBase: All you need to know #BigData mini-tutorial; #MachineLearning overtaking #BigData?**Keep it simple! How to understand Gradient Descent algorithm**- Apr 28, 2017.

In Data Science, Gradient Descent is an important and difficult concept. Here we explain this concept with an example, in a very simple way. Check this out.**What Top Firms Ask: 100+ Data Science Interview Questions**- Mar 22, 2017.

Check this out: A topic-wise collection of 100+ data science interview questions from top companies.**Getting Up Close and Personal with Algorithms**- Mar 21, 2017.

We've put together a brief summary of the top algorithms used in predictive analysis, which you can see just below. Read on to learn more about Linear Regression, Logistic Regression, Decision Trees, Random Forests, Gradient Boosting, and more.**Toward Increased k-means Clustering Efficiency with the Naive Sharding Centroid Initialization Method**- Mar 13, 2017.

What if a simple, deterministic approach which did not rely on randomization could be used for centroid initialization? Naive sharding is such a method, and its time-saving and efficient results, though preliminary, are promising.**Netflix: Manager, Content Programming Science & Algorithms**- Feb 27, 2017.

Seeking a Manager of Content Programming Science & Algorithms, an experienced and entrepreneurial-minded data scientist. This is a high-impact and challenging role that will require both strong leadership and technical prowess.