# Tag: Algorithms (158)

**The 5 Sampling Algorithms every Data Scientist need to know**- Sep 18, 2019.

Algorithms are at the core of data science and sampling is a critical technique that can make or break a project. Learn more about the most common sampling techniques used, so you can select the best approach while working with your data.

**There is No Free Lunch in Data Science**- Sep 12, 2019.

There is no such thing as a free lunch in life or data science. Here, we'll explore some science philosophy and discuss the No Free Lunch theorems to find out what they mean for the field of data science.

**A Friendly Introduction to Support Vector Machines**- Sep 12, 2019.

This article explains the Support Vector Machines (SVM) algorithm in an easy way.

**The 5 Graph Algorithms That Data Scientists Should Know**- Sep 10, 2019.

In this post, I am going to be talking about some of the most important graph algorithms you should know and how to implement them using Python.

**Top KDnuggets tweets, Aug 21-27: Algorithms Notes for Professionals – Free Book**- Aug 28, 2019.

Algorithms Notes for Professionals - Free Book; 10 simple Linux tips which save 50% of my time in the command line; Why so many #DataScientists are leaving their jobs; Order Matters: Alibaba Transformer-based Recommender System

**How to count Big Data: Probabilistic data structures and algorithms**- Aug 26, 2019.

Learn how probabilistic data structures and algorithms can be used for cardinality estimation in Big Data streams.

**Automate Stacking In Python: How to Boost Your Performance While Saving Time**- Aug 21, 2019.

Utilizing stacking (stacked generalizations) is a very hot topic when it comes to pushing your machine learning algorithm to new heights. For instance, most if not all winning Kaggle submissions nowadays make use of some form of stacking or a variation of it.

**Coding Random Forests in 100 lines of code***- Aug 7, 2019.

There are dozens of machine learning algorithms out there. It is impossible to learn all their mechanics; however, many algorithms sprout from the most established algorithms, e.g. ordinary least squares, gradient boosting, support vector machines, tree-based algorithms and neural networks.

**An Overview of Outlier Detection Methods from PyOD – Part 1**- Jun 27, 2019.

PyOD is an outlier detection package developed with a comprehensive API to support multiple techniques. This post will showcase Part 1 of an overview of techniques that can be used to analyze anomalies in data.

**10 Gradient Descent Optimisation Algorithms + Cheat Sheet**- Jun 26, 2019.

Gradient descent is an optimization algorithm used for minimizing the cost function in various ML algorithms. Here are some common gradient descent optimisation algorithms used in the popular deep learning frameworks such as TensorFlow and Keras.

**The Machine Learning Puzzle, Explained**- Jun 17, 2019.

Lots of moving parts go into creating a machine learning model. Let's take a look at some of these core concepts and see how the machine learning puzzle comes together.

**Think Like an Amateur, Do As an Expert: Lessons from a Career in Computer Vision**- May 17, 2019.

Dr. Takeo Kanade shared his life lessons from an illustrious 50-year career in Computer Vision at last year's Embedded Vision Summit. You have a chance to attend the 2019 Embedded Vision Summit, from May 20-23, in the Santa Clara Convention Center, Santa Clara CA.

**Naive Bayes: A Baseline Model for Machine Learning Classification Performance**- May 7, 2019.

We can use pandas to work through Bayes' theorem and scikit-learn to implement the Naive Bayes algorithm. We take a step-by-step approach to understanding Bayes and implementing the different options in scikit-learn.

**KDnuggets™ News 19:n17, May 1: The most desired skill in data science; Seeking KDnuggets Editors, work remotely**- May 1, 2019.

This week, find out about the most desired skill in data science, learn which projects to include in your portfolio, identify a single strategy for pulling data from a Pandas DataFrame (once and for all), read the results of our Top Data Science and Machine Learning Methods poll, and much more.

**Top Data Science and Machine Learning Methods Used in 2018, 2019**- Apr 29, 2019.

Once again, the most used methods are Regression, Clustering, Visualization, Decision Trees/Rules, and Random Forests. The greatest relative increases this year are overwhelmingly Deep Learning techniques, while SVD, SVMs and Association Rules show the greatest decline.

**How Machines Make Sense of Big Data: an Introduction to Clustering Algorithms**- Apr 16, 2019.

We outline three different clustering algorithms - k-means clustering, hierarchical clustering and Graph Community Detection - explaining when to use each and how they work, along with a worked example.

**Which Data Science / Machine Learning methods and algorithms did you use in 2018/2019 for a real-world application?**- Apr 9, 2019.

Which Data Science / Machine Learning methods and algorithms did you use in 2018/2019 for a real-world application? Take part in the latest KDnuggets survey and have your say.

**Artificial Neural Networks Optimization using Genetic Algorithm with Python**- Mar 18, 2019.

This tutorial explains the usage of the genetic algorithm for optimizing the network weights of an Artificial Neural Network for improved performance.

**Designing Ethical Algorithms**- Mar 8, 2019.

Ethical algorithm design is becoming a hot topic as machine learning becomes more widespread. But how do you make an algorithm ethical? Here are 5 suggestions to consider.

**The Algorithms Aren’t Biased, We Are**- Jan 29, 2019.

We explain the concept of bias and how it can appear in your projects, share some illustrative examples, and translate the latest academic research on “algorithmic bias.”

**Data Science and Ethics – Why Companies Need a new CEO (Chief Ethics Officer)**- Jan 21, 2019.

We explain why data science companies need to have a Chief Ethics Officer and discuss their importance in tackling algorithm bias.

**A Guide to Decision Trees for Machine Learning and Data Science**- Dec 24, 2018.

What makes decision trees special in the realm of ML models is really their clarity of information representation. The “knowledge” learned by a decision tree through training is directly formulated into a hierarchical structure.

**10 More Must-See Free Courses for Machine Learning and Data Science**- Dec 20, 2018.

Have a look at this follow-up collection of free machine learning and data science courses to give you some winter study ideas.

**Machine learning — Is the emperor wearing clothes?**- Oct 12, 2018.

We take a look at the core concepts of Machine Learning, including the data, algorithm and optimization needed to get you started, with links to additional resources to help enhance your knowledge.

**KDnuggets™ News 18:n38, Oct 10: Concise Explanation of Learning Algorithms; Why I Call Myself a Data Scientist; Linear Regression in the Wild**- Oct 10, 2018.

This week, KDnuggets brings you a discussion of learning algorithms with a hat tip to Tom Mitchell, discusses why you might call yourself a data scientist, explores machine learning in the wild, checks out some top trends in deep learning, shows you how to learn data science if you are low on finances, and puts forth one person's opinion on the top 8 Python machine learning libraries to help get the job done.

**A Concise Explanation of Learning Algorithms with the Mitchell Paradigm**- Oct 5, 2018.

A single quote from Tom Mitchell can shed light on both the abstract concept and concrete implementations of machine learning algorithms.

**Linear Regression in the Wild**- Oct 3, 2018.

We take a look at how to use linear regression when the dependent variables have measurement errors.

**What If the Data Tells You to Be Racist? When Algorithms Explicitly Penalize**- Sep 26, 2018.

Without the right precautions, machine learning (the technology that drives risk-assessment in law enforcement, as well as hiring and loan decisions) explicitly penalizes underprivileged groups.

**KDnuggets™ News 18:n36, Sep 26: Machine Learning Algorithms From Scratch; Deep Learning Framework Popularity; Data Capture, the Deep Learning Way**- Sep 26, 2018.

Also: SQL Case Study: Helping a Startup CEO Manage His Data; Building a Machine Learning Model through Trial and Error; The Whys and Hows of Web Scraping; Unfolding Naive Bayes From Scratch; "Auto-What?" - A Taxonomy of Automated Machine Learning

**Selecting the Best Machine Learning Algorithm for Your Regression Problem**- Aug 1, 2018.

This post should then serve as a great aid in selecting the best ML algorithm for your regression problem!

**Weapons of Math Destruction, Ethical Matrix, Nate Silver and more Highlights from the Data Science Leaders Summit**- Jul 31, 2018.

Domino Data Lab hosted its first ever Data Science Leaders Summit at the lovely Yerba Buena Center for the Arts in San Francisco on May 30-31, 2018. Cathy O'Neil, Nate Silver, Cassie Kozyrkov and Eric Colson were some of the speakers at this event.

**Genetic Algorithm Implementation in Python**- Jul 24, 2018.

This tutorial will implement the genetic algorithm optimization technique in Python based on a simple example in which we are trying to maximize the output of an equation.

**Clustering Using K-means Algorithm**- Jul 18, 2018.

This article explains the K-means algorithm in an easy way. I’d like to start with an example to understand the objective of this powerful technique in machine learning before getting into the algorithm, which is quite simple.

**Deep Learning and Challenges of Scale Webinar**- Jul 9, 2018.

Join Nvidia for an on-demand webinar to learn how to tackle the challenges of scaling and building complex deep learning systems.

**KDnuggets™ News 18:n16, Apr 18: Key Algorithms and Statistical Models; Don’t learn Machine Learning in 24 hours; Data Scientist among the best US Jobs in 2018**- Apr 18, 2018.

Also: Top 10 Technology Trends of 2018; 12 Useful Things to Know About Machine Learning; Robust Word2Vec Models with Gensim & Applying Word2Vec Features for Machine Learning Tasks; Understanding What is Behind Sentiment Analysis - Part 1; Getting Started with PyTorch

**Key Algorithms and Statistical Models for Aspiring Data Scientists**- Apr 16, 2018.

This article provides a summary of key algorithms and statistical techniques commonly used in industry, along with a short resource related to these techniques.

**Ten Machine Learning Algorithms You Should Know to Become a Data Scientist**- Apr 11, 2018.

It's important for data scientists to have a broad range of knowledge, keeping themselves updated with the latest trends. With that being said, we take a look at the top 10 machine learning algorithms every data scientist should know.

**Genetic Algorithm Key Terms, Explained**- Apr 10, 2018.

This article presents simple definitions for 12 genetic algorithm key terms, in order to help better introduce the concepts to newcomers.

**Top 20 Deep Learning Papers, 2018 Edition**- Apr 3, 2018.

Deep Learning is constantly evolving at a fast pace. New techniques, tools and implementations are changing the field of Machine Learning and bringing excellent results.

**Multiscale Methods and Machine Learning**- Mar 19, 2018.

We highlight recent developments in machine learning and Deep Learning related to multiscale methods, which analyze data at a variety of scales to capture a wider range of relevant features. We give a general overview of multiscale methods, examine recent successes, and compare with similar approaches.

**KDnuggets™ News 18:n09, Feb 28: Gartner 2018 MQ for Data Science/ML – Gainers and Losers; Comparative Analysis of Top 6 BI/Data Viz Tools**- Feb 28, 2018.

A Comparative Analysis of Top 6 BI and Data Visualization Tools; A Tour of The Top 10 Algorithms for Machine Learning Newbies; A Guide to Hiring Data Scientists.

**5 Things You Need To Know About Data Science**- Feb 19, 2018.

Here are 5 useful things to know about Data Science, including its relationship to BI, Data Mining, Predictive Analytics, and Machine Learning; Data Scientist job prospects; where to learn Data Science; and which algorithms/methods are used by Data Scientists.

**Logistic Regression: A Concise Technical Overview**- Feb 16, 2018.

Interested in learning the concepts behind Logistic Regression (LogR)? Looking for a concise introduction to LogR? This article is for you. Includes a Python implementation and links to an R script as well.

**KDnuggets™ News 18:n07, Feb 14: 5 Machine Learning Projects You Should Not Overlook; Intro to Python Ensembles**- Feb 14, 2018.

5 Machine Learning Projects You Should Not Overlook; Introduction to Python Ensembles; Which Machine Learning Algorithm be used in year 2118?; Fast.ai Lesson 1 on Google Colab (Free GPU)

**A Basic Recipe for Machine Learning**- Feb 13, 2018.

One of the gems that I felt needed to be written down from Ng's deep learning courses is his general recipe for approaching a deep learning algorithm/model.

**Which Machine Learning Algorithm be used in year 2118?**- Feb 9, 2018.

So what answers popped into your head? Random forest, SVM, k-means, k-NN, or even Deep Learning? No; for the answer, we turn to the Lindy Effect.

**Top KDnuggets tweets, Jan 24-30: Top 10 Algorithms for Machine Learning Newbies; Want to Become a Data Scientist? Try Feynman Technique**- Jan 31, 2018.

Also: Chronological List of AI Books To Read - from Goedel, Escher, Bach ... ; Aspiring Data Scientists! Start to learn Statistics with these 6 books.

**Topological Data Analysis for Data Professionals: Beyond Ayasdi**- Jan 16, 2018.

We review recent developments and tools in topological data analysis, including applications of persistent homology to psychometrics and a recent extension of piecewise regression, called Morse-Smale regression.

**Quantum Machine Learning: An Overview**- Jan 5, 2018.

Quantum Machine Learning (Quantum ML) is the interdisciplinary area combining Quantum Physics and Machine Learning (ML). It is a symbiotic association: leveraging the power of Quantum Computing to produce quantum versions of ML algorithms, and applying classical ML algorithms to analyze quantum systems. Read this article for an introduction to Quantum ML.

**How to Improve Machine Learning Algorithms? Lessons from Andrew Ng, part 2**- Dec 21, 2017.

The second chapter of ML lessons from Ng’s experience. This one covers only Human Level Performance & Avoidable Bias.

**Accelerating Algorithms: Considerations in Design, Algorithm Choice and Implementation**- Dec 18, 2017.

If you are trying to make your algorithms run faster, you may want to consider reviewing some important points on design and implementation.

**Top Data Science and Machine Learning Methods Used in 2017**- Dec 11, 2017.

The most used methods are Regression, Clustering, Visualization, Decision Trees/Rules, and Random Forests; Deep Learning is used by only 20% of respondents; we also analyze which methods are most "industrial" and most "academic".

**New Poll: Which Data Science / Machine Learning methods and tools you used?**- Nov 20, 2017.

Please vote in the new KDnuggets poll, which examines the methods and tools used for a real-world application or project.

**The 10 Statistical Techniques Data Scientists Need to Master**- Nov 15, 2017.

The author presents 10 statistical techniques which a data scientist needs to master. Build up your toolbox of data science tools by having a look at this great overview post.

**Machine Learning Algorithms: Which One to Choose for Your Problem**- Nov 14, 2017.

This article will try to explain basic concepts and give some intuition for using different kinds of machine learning algorithms in different tasks. At the end of the article, you’ll find a structured overview of the main features of the described algorithms.

**Density Based Spatial Clustering of Applications with Noise (DBSCAN)**- Oct 26, 2017.

DBSCAN clustering can identify outliers: observations that do not belong to any cluster. Because DBSCAN also determines the number of clusters, it is very useful for unsupervised learning when we don’t know how many clusters the data might contain.

**Top 10 Machine Learning with R Videos**- Oct 24, 2017.

A complete video guide to Machine Learning in R! This great compilation of tutorials and lectures is an amazing recipe to start developing your own Machine Learning projects.

**Top 10 Machine Learning Algorithms for Beginners**- Oct 20, 2017.

A beginner's introduction to the Top 10 Machine Learning (ML) algorithms, complete with figures and examples for easy understanding.

**Random Forests(r), Explained**- Oct 17, 2017.

Random Forest is one of the most popular and powerful ensemble methods used today in Machine Learning. This post is an introduction to the algorithm and provides a brief overview of its inner workings.

**5 overriding factors for the successful implementation of AI**- Oct 6, 2017.

Today AI is everywhere, from virtual assistants scheduling meetings, to facial recognition software and increasingly autonomous cars. We review 5 main factors for successful AI implementation.

**KDnuggets™ News 17:n38, Oct 4: What Blockchains Mean to Big Data; Keras Deep Learning Cheat Sheet; Machine Learning in Finance**- Oct 4, 2017.

Also: XGBoost, a Top Machine Learning Method on Kaggle, Explained; How to win Kaggle competition based on NLP task, if you are not an NLP expert; Fundamental Breakthrough in 2 Decade Old Algorithm Redefines Big Data Benchmarks

**XGBoost, a Top Machine Learning Method on Kaggle, Explained**- Oct 3, 2017.

Looking to boost your machine learning competition score? Here’s a brief summary and introduction to a powerful and popular tool among Kagglers, XGBoost.

**Understanding Machine Learning Algorithms**- Oct 3, 2017.

Machine learning algorithms aren’t difficult to grasp if you understand the basic concepts. Here, a SAS data scientist describes the foundations for some of today’s popular algorithms.

**Fundamental Breakthrough in 2 Decade Old Algorithm Redefines Big Data Benchmarks**- Sep 28, 2017.

Read on to find out how the two-decade-old minwise hashing computational barrier has been overcome with a significantly more efficient alternative.

**K-Nearest Neighbors – the Laziest Machine Learning Technique**- Sep 12, 2017.

K-Nearest Neighbors (K-NN) is one of the simplest machine learning algorithms. When a new situation occurs, it scans through all past experiences and looks up the k closest experiences. Those experiences (or: data points) are what we call the k nearest neighbors.

**Search Millions of Documents for Thousands of Keywords in a Flash**- Sep 1, 2017.

We present a Python library called FlashText that can search or replace keywords / synonyms in documents in O(n) – linear time.

**Support Vector Machine (SVM) Tutorial: Learning SVMs From Examples**- Aug 28, 2017.

In this post, we will try to gain a high-level understanding of how SVMs work. I’ll focus on developing intuition rather than rigor. What that essentially means is we will skip as much of the math as possible and develop a strong intuition of the working principle.

**How To Write Better SQL Queries: The Definitive Guide – Part 2**- Aug 24, 2017.

Most forget that SQL isn’t just about writing queries, which is just the first step down the road. Ensuring that queries are performant or that they fit the context that you’re working in is a whole other thing. This SQL tutorial will provide you with a small peek at some steps that you can go through to evaluate your query.

**Recommendation System Algorithms: An Overview**- Aug 22, 2017.

This post presents an overview of the main existing recommendation system algorithms, in order for data scientists to choose the best one according to a business’s limitations and requirements.

**The Machine Learning Abstracts: Support Vector Machines**- Aug 10, 2017.

While earlier entries in this series covered elementary classification algorithms, another (more advanced) machine learning algorithm which can be used for classification is Support Vector Machines (SVM).

**Machine Learning Algorithms: A Concise Technical Overview – Part 1**- Aug 4, 2017.

These short and to-the-point tutorials may provide the assistance you are looking for. Each of these posts concisely covers a single, specific machine learning concept.

**The Machine Learning Abstracts: Decision Trees**- Aug 3, 2017.

Decision trees are a classic machine learning technique. The basic intuition behind a decision tree is to map out all possible decision paths in the form of a tree.

**The Machine Learning Abstracts: Classification**- Jul 27, 2017.

Classification is the process of categorizing or “classifying” some items into a predefined set of categories or “classes”. It is exactly the same even when a machine does so. Let’s dive a little deeper.

**Design by Evolution: How to evolve your neural network with AutoML**- Jul 20, 2017.

The gist (tl;dr): Time to evolve! I’m gonna give a basic example (in PyTorch) of using evolutionary algorithms to tune the hyper-parameters of a DNN.

**The Machine Learning Algorithms Used in Self-Driving Cars**- Jun 19, 2017.

Machine Learning applications include evaluation of driver condition or driving scenario classification through data fusion from different external and internal sensors. We examine different algorithms used for self-driving cars.

**Which Machine Learning Algorithm Should I Use?**- Jun 1, 2017.

A typical question asked by a beginner, when facing a wide variety of machine learning algorithms, is “which algorithm should I use?” The answer to the question varies depending on many factors, including the size, quality, and nature of data, the available computational time, and more.

**Top KDnuggets tweets, May 10-16: Which Machine Learning algorithm should I use? #cheatsheet**- May 17, 2017.

Also: HDFS vs. HBase: All you need to know #BigData mini-tutorial; #MachineLearning overtaking #BigData?

**Keep it simple! How to understand Gradient Descent algorithm**- Apr 28, 2017.

In Data Science, Gradient Descent is an important and difficult concept. Here we explain this concept with an example, in a very simple way. Check this out.

**What Top Firms Ask: 100+ Data Science Interview Questions**- Mar 22, 2017.

Check this out: A topic-wise collection of 100+ data science interview questions from top companies.

**Getting Up Close and Personal with Algorithms**- Mar 21, 2017.

We've put together a brief summary of the top algorithms used in predictive analysis, which you can see just below. Read to learn more about Linear Regression, Logistic Regression, Decision Trees, Random Forests, Gradient Boosting, and more.

**Toward Increased k-means Clustering Efficiency with the Naive Sharding Centroid Initialization Method**- Mar 13, 2017.

What if a simple, deterministic approach which did not rely on randomization could be used for centroid initialization? Naive sharding is such a method, and its time-saving and efficient results, though preliminary, are promising.

**Netflix: Manager, Content Programming Science & Algorithms**- Feb 27, 2017.

Seeking a Manager of Content Programming Science & Algorithms, an experienced and entrepreneurial-minded data scientist. This is a high-impact and challenging role that will require both strong leadership and technical prowess.

**17 More Must-Know Data Science Interview Questions and Answers, Part 2**- Feb 22, 2017.

The second part of 17 new must-know Data Science Interview questions and answers covers overfitting, ensemble methods, feature selection, ground truth in unsupervised learning, the curse of dimensionality, and parallel algorithms.

**Top KDnuggets tweets, Jan 25-31: Python implementations of Andrew Ng #MachineLearning MOOC exercises**- Feb 1, 2017.

#Python implementations of Andrew Ng #MachineLearning MOOC exercises; This repository contains the entire #Python #DataScience Handbook; What are the best #visualizations of #MachineLearning algorithms? Learn #TensorFlow and #DeepLearning, without a PhD.

**KDnuggets™ News 17:n04, Feb 1: Data Science and Python Wrangling: Pandas Cheat Sheet; Great Collection of Machine Learning Algorithms**- Feb 1, 2017.

Also: Great Collection of Minimal and Clean Implementations of Machine Learning Algorithms; Bad Data + Good Models = Bad Results; Data Scientist - best job in America, again.

**Great Collection of Minimal and Clean Implementations of Machine Learning Algorithms**- Jan 25, 2017.

Interested in learning machine learning algorithms by implementing them from scratch? Need a good set of examples to work from? Check out this post with links to minimal and clean implementations of various algorithms.

**Zoll LifeVest: Advisory Researcher, Predictive Algorithms**- Jan 18, 2017.

Seeking individuals to conduct applied research on predictive algorithms to advance the company strategy of improving outcomes for patients at risk of Sudden Cardiac Arrest, collaborating with teams of Physicians, Software Engineers, Data Scientists, and Machine Learning Specialists.

**More Data or Better Algorithms: The Sweet Spot**- Jan 17, 2017.

We examine the sweet spot for data-driven Machine Learning companies, where it is neither too easy nor too hard to collect the needed data.

**Random Forests in Python**- Dec 2, 2016.

Random forest is a highly versatile machine learning method with numerous applications ranging from marketing to healthcare and insurance. This is a post about random forests using Python.

**Ethical Implications Of Industrialized Analytics**- Nov 29, 2016.

Analytics & Big Data will be involved in every aspect of our lives and we should handle the ethical dilemmas wisely to let innovation contribute more to our lives.

**Linear Regression, Least Squares & Matrix Multiplication: A Concise Technical Overview**- Nov 24, 2016.

Linear regression is a simple algebraic tool which attempts to find the “best” line fitting 2 or more attributes. Read here to discover the relationship between linear regression, the least squares method, and matrix multiplication.

**Predictive Science vs Data Science**- Nov 22, 2016.

Is Predictive Science accurately represented by the term Data Science? As a matter of fact, are any of Data Science's constituent sciences well-represented by the umbrella term? This post discusses a few of these points at a high level.

**The Foundations of Algorithmic Bias**- Nov 16, 2016.

We might hope that algorithmic decision making would be free of biases. But increasingly, the public is starting to realize that machine learning systems can exhibit these same biases and more. In this post, we look at precisely how that happens.

**Parallelism in Machine Learning: GPUs, CUDA, and Practical Applications**- Nov 10, 2016.

The lack of parallel processing in machine learning tasks inhibits economy of performance, yet it may very well be worth the trouble. Read on for an introductory overview of GPU-based parallelism, the CUDA framework, and some thoughts on practical implementation.

**Using Predictive Algorithms to Track Real Time Health Trends**- Nov 4, 2016.

How to build a real-time health dashboard for tracking a person's blood pressure readings, do time series analysis, and then graph the trends over time using predictive algorithms.

**Decision Tree Classifiers: A Concise Technical Overview**- Nov 3, 2016.

The decision tree is one of the oldest and most intuitive classification algorithms in existence. This post provides a straightforward technical overview of this brand of classifiers.

**KDnuggets™ News 16:n39, Nov 2: Machine Learning: A Complete and Detailed Overview; Learn Data Science in 8 (Easy) Steps**- Nov 2, 2016.

Machine Learning: A Complete and Detailed Overview; Cartoon: Scary Big Data; Learn Data Science in 8 (Easy) Steps; Is Your Code Good Enough to Call Yourself a Data Scientist?; Using Machine Learning to Detect Malicious URLs; Frequent Pattern Mining and the Apriori Algorithm

**Frequent Pattern Mining and the Apriori Algorithm: A Concise Technical Overview**- Oct 27, 2016.

This post provides a technical overview of frequent pattern mining algorithms (also known by a variety of other names), along with its most famous implementation, the Apriori algorithm.

**Top KDnuggets tweets, Oct 12-18: #DeepLearning Key Terms, Explained; Free Foundations of #DataScience text PDF**- Oct 20, 2016.

#DeepLearning Key Terms, Explained; Free Foundations of #DataScience text PDF; Top 12 Interesting Careers to Explore in #BigData; #ICYMI The 10 Algorithms #MachineLearning Engineers Need to Know

**Intellectual Ventures Lab: Sr. Machine Learning Algorithm Development Software Engineer**- Oct 18, 2016.

Seeking a Senior Machine-Learning Algorithm Development Software Engineer to provide technical leadership to fast-paced machine-learning development projects.

**Rexer Analytics Data Science Survey Highlights**- Oct 14, 2016.

Regression, Decision Trees, and Cluster analysis remain the most commonly used algorithms in the field, R continues to ascend, job satisfaction remains high, but customer understanding still needs improvement.

**Top Data Scientist Claudia Perlich’s Favorite Machine Learning Algorithm**- Sep 27, 2016.

Interested in the reasons why a top data scientist is partial to one particular algorithm over others? Read on to find out.

**Comparing Clustering Techniques: A Concise Technical Overview**- Sep 26, 2016.

A wide array of clustering techniques are in use today. Given the widespread use of clustering in everyday data mining, this post provides a concise technical overview of 2 such exemplar techniques.

**Data Science Basics: 3 Insights for Beginners**- Sep 22, 2016.

For data science beginners, 3 elementary issues are given overview treatment: supervised vs. unsupervised learning, decision tree pruning, and training vs. testing datasets.

**Support Vector Machines: A Concise Technical Overview**- Sep 21, 2016.

Support Vector Machines remain a popular and time-tested classification algorithm. This post provides a high-level concise technical overview of their functionality.

**KDnuggets™ News 16:n34, Sep 21: The Great Algorithm Tutorial Roundup; 7 Steps to Mastering Apache Spark 2.0**- Sep 21, 2016.

The Great Algorithm Tutorial Roundup; 7 Steps to Mastering Apache Spark 2.0; Machine Learning in a Year: From Total Noob to Effective Practitioner; Learning From Data (Introductory Machine Learning) Caltech MOOC

**The Great Algorithm Tutorial Roundup**- Sep 20, 2016.

This is a collection of tutorials relating to the results of the recent KDnuggets algorithms poll. If you are interested in learning or brushing up on the most used algorithms, as per our readers, look here for suggestions on doing so!

**KDnuggets™ News 16:n33, Sep 14: Top Algorithms Used by Data Scientists; (Not So) New Data Scientist Venn Diagram**- Sep 14, 2016.

Top Algorithms Used by Data Scientists; Guide To Understanding Convolutional Neural Nets; The (Not So) New Data Scientist Venn Diagram; Deep Learning Networks with Stochastic Depth.

**Top Algorithms and Methods Used by Data Scientists**- Sep 12, 2016.

The latest KDnuggets poll identifies the top algorithms actually used by Data Scientists and finds surprises, including the most academic and most industry-oriented algorithms.

**New Poll: Which methods/algorithms you used for a Data Science or Machine Learning application?**- Aug 26, 2016.

Which methods/approaches did you use in the past 12 months for an actual Data Science-related application? Please vote and we will analyze and publish the results.

**Introduction to Local Interpretable Model-Agnostic Explanations (LIME)**- Aug 25, 2016.

Learn about LIME, a technique to explain the predictions of any machine learning classifier.

**A Gentle Introduction to Bloom Filter**- Aug 24, 2016.

The Bloom Filter is a probabilistic data structure which can make a tradeoff between space and false positive rate. Read more, and see an implementation from scratch, in this post.

**KDnuggets™ News 16:n31, Aug 24: 10 Algo Machine Learning Engineers Need to Know; How to Become a Data Scientist; Gentle Tensorflow**- Aug 24, 2016.

The 10 Algorithms Machine Learning Engineers Need to Know; How to Become a Data Scientist - Part 1; The Gentlest Introduction to Tensorflow - Part 1; Approaching (Almost) Any Machine Learning Problem.**The 10 Algorithms Machine Learning Engineers Need to Know**- Aug 18, 2016.

Read this introductory list of contemporary machine learning algorithms of importance that every engineer should understand.**Understanding the Empirical Law of Large Numbers and the Gambler’s Fallacy**- Aug 12, 2016.

The law of large numbers is an important concept for practising data scientists. In this post, the empirical law of large numbers is demonstrated via a simple simulation approach using the Bernoulli process.**Contest 2nd Place: Automating Data Science**- Aug 3, 2016.

This post discusses some considerations, options, and opportunities for automating aspects of data science and machine learning. It is the second place recipient (tied) in the recent KDnuggets blog contest.**10 Algorithm Categories for AI, Big Data, and Data Science**- Jul 14, 2016.

With a focus on leveraging algorithms and balancing human and AI capital, here are the top 10 algorithm categories used to implement A.I., Big Data, and Data Science.**Improving Nudity Detection and NSFW Image Recognition**- Jun 25, 2016.

This post discusses improvements made in a tricky machine learning classification problem: nude and/or NSFW, or not?**Machine Learning Trends and the Future of Artificial Intelligence**- Jun 22, 2016.

The confluence of data flywheels, the algorithm economy, and cloud-hosted intelligence means every company can now be a data company, every company can now access algorithmic intelligence, and every app can now be an intelligent app.**A Visual Explanation of the Back Propagation Algorithm for Neural Networks**- Jun 17, 2016.

A concise explanation of backpropagation for neural networks is presented in elementary terms, along with explanatory visualization.**Figuring Out the Algorithms of Intelligence**- Jun 15, 2016.

Marvin Minsky, the father of AI, passed away this year. One of his inventions was the confocal microscope, which we used to take this high-resolution picture of a live brain circuit. Something in these cells allows them to automatically identify useful connections and establish useful networks out of information.**Deep Learning, Pachinko, and James Watt: Efficiency is the Driver of Uncertainty**- Jun 8, 2016.

A reasoned discussion of why the next generation of data-efficient learning approaches relies on our developing new algorithms that can propagate stochasticity or uncertainty right through the model, and which are mathematically more involved than the standard approaches.**Data Science of Variable Selection: A Review**- Jun 7, 2016.

There are as many approaches to selecting features as there are statisticians since every statistician and their sibling has a POV or a paper on the subject. This is an overview of some of these approaches.**The Truth About Deep Learning**- Jun 6, 2016.

An honest look at deep learning, what it is **not**, its advantages over "shallow" neural networks, and some of the common assumptions and conflations that surround it.**KDnuggets™ News 16:n17, May 11: Machine Learning Algorithms From Scratch; Big Data Leaders on LinkedIn; Data Science Career**- May 11, 2016.

Why Implement Machine Learning Algorithms From Scratch? Meet the 11 Big Data, Data Science Leaders on LinkedIn; Free Advice For Building Your Data Science Career; 10 Essential Books for Data Enthusiast.**Why Implement Machine Learning Algorithms From Scratch?**- May 6, 2016.

Even with machine learning libraries covering almost any algorithm implementation you could imagine, there are often still good reasons to write your own. Read on to find out what these reasons are.**KDnuggets™ News 16:n16, May 4: How to Remove Duplicates from Large Data; Datasets over Algorithms; When Automation goes too far**- May 4, 2016.

How to Remove Duplicates in Large Datasets; The Development of Classification as a Learning Machine; Datasets Over Algorithms; Cartoon: When Automation Goes Too Far, and more.**Datasets Over Algorithms**- May 3, 2016.

The average elapsed time between key algorithm proposals and corresponding advances is about 18 years; the average elapsed time between key dataset availabilities and corresponding advances is less than 3 years, 6 times faster.**Metrics Gone Wrong – How Companies Are Optimizing The Wrong Way**- Apr 20, 2016.

A critique of the over-abundant and misguided pursuit of metric completeness, and how it can result in incorrect "optimization."**Basics of GPU Computing for Data Scientists**- Apr 7, 2016.

With the rise of neural networks in data science, the demand for computationally intensive machines has led to GPUs. Learn how you can get started with GPUs and the algorithms that can leverage them.