- Keras Hyperparameter Tuning in Google Colab Using Hyperas - Dec 12, 2018.
In this post, I will show you how to tune the hyperparameters of your existing Keras models using Hyperas and run everything in a Google Colab Notebook.
Automated Machine Learning, Google, Google Colab, Hyperparameter, Keras, Python
- KDnuggets™ News 18:n47, Dec 12: Common mistakes when doing machine learning; Here are the most popular Python IDEs / Editors - Dec 12, 2018.
Common mistakes when carrying out machine learning and data science; Most popular Python IDEs/Editors; Machine Learning / AI Main Developments in 2018 and Key Trends for 2019; Machine Learning Project checklist.
Data Science, Machine Learning, Mistakes, Python, Trends
- Introduction to Named Entity Recognition - Dec 11, 2018.
Named Entity Recognition is a tool which invariably comes in handy when we do Natural Language Processing tasks. Read on to find out how.
NLP, Python, Text Classification
- Here are the most popular Python IDEs / Editors - Dec 7, 2018.
We report on the most popular IDEs and editors, based on our poll. Jupyter is the favorite across all regions and employment types, but there is competition for the no. 2 and no. 3 spots.
IDE, Jupyter, Poll, Programming, PyCharm, Python, Visual Studio Code
- Four Techniques for Outlier Detection - Dec 6, 2018.
There are many techniques to detect and optionally remove outliers from a dataset. In this blog post, we show an implementation in KNIME Analytics Platform of four of the most frequently used - traditional and novel - techniques for outlier detection.
DBSCAN, Knime, Outliers, Python
- The Quick Python Book - Dec 5, 2018.
Written for programmers new to Python, this latest edition includes new exercises throughout. It covers features common to other languages concisely, while introducing Python's comprehensive standard functions library and unique features in detail.
Book, Manning, Python
- Handling Imbalanced Datasets in Deep Learning - Dec 4, 2018.
It’s important to understand why we should balance classes, so that we can be sure it’s a valuable investment. Class balancing techniques are only really necessary when we actually care about the minority classes.
Balancing Classes, Datasets, Deep Learning, Keras, Python
- Best Machine Learning Languages, Data Visualization Tools, DL Frameworks, and Big Data Tools - Dec 3, 2018.
We cover a variety of topics, from machine learning to deep learning, from data visualization to data tools, with comments and explanations from experts in the relevant fields.
Big Data, Data Visualization, Deep Learning, Jupyter, Machine Learning, Python, R, Tableau
- Free ebook: Exploring Data with Python - Nov 29, 2018.
This free eBook starts building your foundation in data science processes with practical Python tips and techniques for working and aspiring data scientists.
Data Science, Free ebook, Manning, Python
- Sales Forecasting Using Facebook’s Prophet - Nov 28, 2018.
In this tutorial, we’ll use Prophet, a package developed by Facebook, to show how one can forecast sales.
Facebook, Python, Sales, Time Series
- KDnuggets™ News 18:n45, Nov 28: Your Favorite Python IDE/editor? Intro to Data Science for Managers - Nov 28, 2018.
Also: 6 Goals Every Wannabe Data Scientist Should Make for 2019; Using a Keras Long Short-Term Memory (LSTM) Model to Predict Stock Prices.
Data Science Skills, Keras, LSTM, Python
- SQL, Python, and R in One Platform - Nov 27, 2018.
Stop jumping between applications. Get a complete analytical toolkit.
Data Science Platform, Data Visualization, Mode Analytics, Python, R, SQL
- What Python editors or IDEs you used the most in 2018? - Nov 27, 2018.
Vote in the new KDnuggets Poll - what are your favorite Python editors or IDEs?
Poll, Python
- KDnuggets™ News 18:n44, Nov 21: What is the Best Python IDE for Data Science?; Anticipating the next move in data science - Nov 21, 2018.
Also: Mastering The New Generation of Gradient Boosting; Top 10 Python Data Science Libraries; Predictive Analytics in 2018: Salaries & Industry Shifts; Sorry I didn't get that! How to understand what your users want; Best Deals in Deep Learning Cloud Providers: From CPU to GPU to TPU
Cloud, Data Science, Gradient Boosting, Interpretability, Machine Learning, Predictive Analytics, Python
- Word Morphing – an original idea - Nov 20, 2018.
In this post, we describe how to use word2vec embeddings and the A* search algorithm to morph between words.
NLP, Python, Text Classification
- Top 10 Python Data Science Libraries - Nov 16, 2018.
The third part of our series investigating the top Python Libraries across Machine Learning, AI, Deep Learning and Data Science.
Data Science, GitHub, numpy, Pandas, Python, StatsModels
- Mastering The New Generation of Gradient Boosting - Nov 15, 2018.
CatBoost, the new kid on the block, has been around for a little more than a year now, and it is already threatening XGBoost, LightGBM and H2O.
Boosting, Gradient Boosting, Machine Learning, Python
- What is the Best Python IDE for Data Science? - Nov 14, 2018.
Before you start learning Python, choose the IDE that suits you the best. We examine many available tools, their pros and cons, and suggest how to choose the best Python IDE for you.
Data Science, IDE, Jupyter, Programming, Python
- Healthcare Analytics Made Simple - Nov 12, 2018.
Finally, a book on Python healthcare machine learning techniques is here! Healthcare Analytics Made Simple does just what the title says: it makes healthcare data science simple and approachable for everyone.
Analytics, Book, Healthcare, Pandas, Python
- Multi-Class Text Classification with Doc2Vec & Logistic Regression - Nov 9, 2018.
Doc2vec is an NLP tool for representing documents as vectors and is a generalization of the word2vec method. To understand doc2vec, it is advisable to first understand the word2vec approach.
Logistic Regression, NLP, Python, Text Classification
- Introduction to PyTorch for Deep Learning - Nov 7, 2018.
In this tutorial, you’ll get an introduction to deep learning using the PyTorch framework, and by its conclusion, you’ll be comfortable applying it to your deep learning models.
Deep Learning, Neural Networks, Python, PyTorch
- KDnuggets™ News 18:n42, Nov 7: The Most in Demand Skills for Data Scientists; How Machines Understand Our Language: Intro to NLP - Nov 7, 2018.
Also: Machine Learning Classification: A Dataset-based Pictorial; Quantum Machine Learning: A look at myths, realities, and future projections; Multi-Class Text Classification Model Comparison and Selection; Top 13 Python Deep Learning Libraries
Classification, Data Science Skills, Deep Learning, Myths, NLP, Python, Text Classification
- Text Preprocessing in Python: Steps, Tools, and Examples - Nov 6, 2018.
We outline the basic steps of text preprocessing, which are needed for transferring text from human language to machine-readable format for further processing. We will also discuss text preprocessing tools.
Data Preparation, NLP, Python, Text Analysis, Text Mining, Tokenization
- Building Surveillance System Using USB Camera and Wireless-Connected Raspberry Pi - Nov 6, 2018.
Read this post to learn how to build a surveillance system using a USB camera plugged into a Raspberry Pi (RPi) that is connected to a PC over its wireless interface.
Computer Vision, Python, Raspberry Pi, Security, Video recognition
- Quantum Machine Learning: A look at myths, realities, and future projections - Nov 5, 2018.
An overview of quantum computing and quantum algorithm design, including the current state of the hardware and algorithm design within the existing systems.
Machine Learning, Python, Quantum Computing, Statistics
- Top 13 Python Deep Learning Libraries - Nov 2, 2018.
Part 2 of a new series investigating the top Python Libraries across Machine Learning, AI, Deep Learning and Data Science.
Caffe, Deep Learning, GitHub, MXNet, Python, PyTorch, TensorFlow, Theano
- Multi-Class Text Classification Model Comparison and Selection - Nov 1, 2018.
This is what we are going to do today: use everything that we have presented about text classification in the previous articles (and more) and compare the text classification models we trained in order to choose the most accurate one for our problem.
Modeling, NLP, Python, Text Classification
- How Machines Understand Our Language: An Introduction to Natural Language Processing - Oct 31, 2018.
The applications of NLP are endless. This is how a machine classifies whether an email is spam or not, whether a review is positive or negative, and how a search engine recognizes what type of person you are based on the content of your query to customize the response accordingly.
Machine Learning, NLP, NLTK, Python, Tokenization
- KDnuggets™ News 18:n41, Oct 31: Introduction to Deep Learning with Keras; Easy Named Entity Recognition with Scikit-Learn - Oct 31, 2018.
Also: Generative Adversarial Networks - Paper Reading Road Map; How I Learned to Stop Worrying and Love Uncertainty; Implementing Automated Machine Learning Systems with Open Source Tools; Notes on Feature Preprocessing: The What, the Why, and the How
Automated Machine Learning, Data Preprocessing, Deep Learning, Generative Adversarial Network, Keras, NLP, Python, scikit-learn
- Stop Installing Tensorflow Using pip for Performance Sake! - Oct 30, 2018.
If you aren’t already using conda, I recommend that you start as it makes managing your data science tools much more enjoyable.
Anaconda, Python, TensorFlow
- Introduction to Deep Learning with Keras - Oct 29, 2018.
In this article, we’ll build a simple neural network using Keras. Now let’s proceed to solve a real business problem: an insurance company wants you to develop a model to help them predict which claims look fraudulent.
Deep Learning, Keras, Neural Networks, Python
- SQL, Python, & R in One Platform - Oct 26, 2018.
No more jumping between applications. Mode Studio combines a SQL editor, Python and R notebooks, and a visualization builder in one platform.
Data Visualization, Mode Analytics, Python, R, SQL
- Notes on Feature Preprocessing: The What, the Why, and the How - Oct 26, 2018.
This article covers a few important points related to the preprocessing of numeric data, focusing on the scaling of feature values, and the broad question of dealing with outliers.
Data Preparation, Data Preprocessing, numpy, Python, scikit-learn, SciPy
- Naive Bayes from Scratch using Python only – No Fancy Frameworks - Oct 25, 2018.
We provide a complete, step-by-step Pythonic implementation of Naive Bayes. Keeping in mind the mathematical and probabilistic difficulties we usually face when trying to dive deep into the algorithmic insights of ML algorithms, this post should be ideal for beginners.
Machine Learning, Naive Bayes, Python
- Get a 2–6x Speed-up on Your Data Pre-processing with Python - Oct 23, 2018.
Get a 2–6x speed-up on your pre-processing with these 3 lines of code!
Data Preprocessing, Efficiency, Programming, Python
- Beginner Data Visualization & Exploration Using Pandas - Oct 22, 2018.
This tutorial offers a beginner's guide to getting around with Pandas for data wrangling and visualization.
Data Exploration, Data Visualization, Pandas, Python
- Accelerating Your Algorithms in Production [Webinar Replay] - Oct 16, 2018.
Numerical algorithms are computationally demanding, which makes performance an important consideration when using Python for machine learning, especially as you move from desktop to production.
ActiveState, Intel, Production, Python
- GitHub Python Data Science Spotlight: High Level Machine Learning & NLP, Ensembles, Command Line Viz & Docker Made Easy - Oct 16, 2018.
This post spotlights 5 open source data science projects, all available in GitHub repositories, focusing on high level machine learning libraries and low level support tools.
Data Science, Docker, Ensemble Methods, fast.ai, GitHub, Machine Learning, NLP, Python
- SQL, Python, & R: All in One Platform - Oct 11, 2018.
Mode Studio connects a SQL editor, Python and R notebooks, and a visualization builder in one platform. Sign up now for access.
Data Visualization, Python, R, SQL
- Evaluating the Business Value of Predictive Models in Python and R - Oct 11, 2018.
In these blog posts for R and Python, we explain four valuable evaluation plots to assess the business value of a predictive model. We show how you can easily create these plots and help you to explain your predictive model to non-techies.
Business Value, Data Visualization, Lift charts, Predictive Models, Python, R
- 10 Best Mobile Apps for Data Scientist / Data Analysts - Oct 10, 2018.
A collection of useful mobile applications that will help enhance your vital data science and analytic skills. These free apps can improve your listening abilities, logical skills, basic leadership qualities and more.
Apps, Data Scientist, Mobile, Python
- KDnuggets™ News 18:n38, Oct 10: Concise Explanation of Learning Algorithms; Why I Call Myself a Data Scientist; Linear Regression in the Wild - Oct 10, 2018.
This week, KDnuggets brings you a discussion of learning algorithms with a hat tip to Tom Mitchell, discusses why you might call yourself a data scientist, explores machine learning in the wild, checks out some top trends in deep learning, shows you how to learn data science if you are low on finances, and puts forth one person's opinion on the top 8 Python machine learning libraries to help get the job done.
Algorithms, Data Science, Deep Learning, Linear Regression, Machine Learning, Python, Tom Mitchell
- Top 8 Python Machine Learning Libraries - Oct 9, 2018.
Part 1 of a new series investigating the top Python Libraries across Machine Learning, AI, Deep Learning and Data Science.
GitHub, Keras, Machine Learning, Python
- Basic Image Data Analysis Using Python – Part 4 - Oct 5, 2018.
Accessing the internal components of digital images using Python packages helps the user understand their properties and nature.
Computer Vision, Image Processing, Python
- Linear Regression in the Wild - Oct 3, 2018.
We take a look at how to use linear regression when the dependent variables have measurement errors.
Algorithms, Linear Regression, Python
- Unleash a Faster Python on Your Data - Oct 2, 2018.
Intel provides an optimized scikit-learn, the most used Python package for classical machine learning. Get a faster scikit-learn through the Intel® Distribution for Python.
Analytics, Intel, Python, scikit-learn
- How to Create a Simple Neural Network in Python - Oct 2, 2018.
The best way to understand how neural networks work is to create one yourself. This article will demonstrate how to do just that.
Machine Learning, Neural Networks, Python
- Basic Image Data Analysis Using Python – Part 3 - Sep 28, 2018.
Accessing the internal components of digital images with Python packages makes it easier to understand their properties and nature.
Computer Vision, Image Processing, numpy, Python
- Visualising Geospatial data with Python using Folium - Sep 27, 2018.
Folium is a powerful data visualization library in Python that was built primarily to help people visualize geospatial data. With Folium, one can create a map of any location in the world if its latitude and longitude values are known. This guide will help you get started.
Data Visualization, Geospatial, GitHub, Python
- Raspberry Pi IoT Projects for Fun and Profit - Sep 27, 2018.
In this post, I will explain how to run an IoT project from the command line, without a graphical interface, using Ubuntu Core on a Raspberry Pi 3.
Data Science, IoT, Python, Raspberry Pi
- Deep Learning Framework Power Scores 2018 - Sep 24, 2018.
Who’s on top in usage, interest, and popularity?
CNTK, Deep Learning, fast.ai, Java, Keras, MXNet, Python, PyTorch, TensorFlow, Theano
- Data Augmentation For Bounding Boxes: Rethinking image transforms for object detection - Sep 19, 2018.
Data augmentation is one way to battle a shortage of data, by artificially augmenting our dataset. In fact, the technique has proven to be so successful that it's become a staple of deep learning systems.
Deep Learning, Image Recognition, Neural Networks, Object Detection, Python
- Iterative Initial Centroid Search via Sampling for k-Means Clustering - Sep 12, 2018.
Thinking about ways to find a better set of initial centroid positions is a valid approach to optimizing the k-means clustering process. This post outlines just such an approach.
Clustering, K-means, Python, Sampling, scikit-learn
- Machine Learning for Text Classification Using SpaCy in Python - Sep 11, 2018.
In this post, we will demonstrate how text classification can be implemented using spaCy without having any deep learning experience.
NLP, Python, Text Analytics, Text Classification, Text Mining
- Training with Keras-MXNet on Amazon SageMaker - Sep 10, 2018.
In this post, you will learn how to train Keras-MXNet jobs on Amazon SageMaker. I’ll show you how to build custom Docker containers for CPU and GPU training, configure multi-GPU training, pass parameters to a Keras script, and save the trained models in Keras and MXNet formats.
Docker, Keras, MXNet, Neural Networks, Python, Sagemaker
- Journey to Machine Learning – 100 Days of ML Code - Sep 7, 2018.
A personal account from Machine Learning enthusiast Avik Jain on his experiences of #100DaysOfMLCode, a challenge that encourages beginners to code and study machine learning for at least an hour, every day for 100 days.
GitHub, K-nearest neighbors, Machine Learning, Python, SVM
- Ultimate Guide to Getting Started with TensorFlow - Sep 6, 2018.
Including video and written tutorials, beginner code examples, useful tricks, helpful communities, books, jobs and more - this is the ultimate guide to getting started with TensorFlow.
Deep Learning, Dropout, Python, TensorFlow
- KDnuggets™ News 18:n33, Sep 5: Practical Topic Modeling with Python; Classifying AI Technologies; Data Science Project Inspiration - Sep 5, 2018.
Also: An End-to-End Project on Time Series Analysis and Forecasting with Python; Financial Data Analysis - Data Processing 1: Loan Eligibility Prediction; OLAP queries in SQL: A Refresher; Word Vectors in Natural Language Processing: Global Vectors (GloVe)
AI, Data Science, Finance, OLAP, Python, SQL, Time Series, Topic Modeling, Word Embeddings
- Financial Data Analysis – Data Processing 1: Loan Eligibility Prediction - Sep 4, 2018.
In this first part, I show how to clean the data and remove unnecessary features. Data processing is very time-consuming, but better data produces a better model.
Data Preprocessing, Data Processing, Finance, Python
- An End-to-End Project on Time Series Analysis and Forecasting with Python - Sep 3, 2018.
Time series are widely used for non-stationary data, such as economic, weather, stock price, and retail sales data. In this post, we demonstrate different approaches for forecasting retail sales time series.
Forecasting, Python, Time Series, Trend Detection
- Optimus v2: Agile Data Science Workflows Made Easy - Aug 30, 2018.
Looking for a library to skyrocket your productivity as a data scientist? Check this out!
Apache Spark, Machine Learning, Pandas, Python
- Top KDnuggets tweets, Aug 22-28: AI Knowledge Map: How To Classify AI Technologies; 100 Days of #MachineLearning Coding with #Python - Aug 29, 2018.
Also 25 fun questions for a machine learning interview; Data Visualization Cheat Sheet
AI, Data Visualization, Interview Questions, Machine Learning, Python, Top tweets
- Deploying scikit-learn Models at Scale - Aug 29, 2018.
Find out how to serve your scikit-learn model in an auto-scaling, serverless environment! Today, we’ll take a trained scikit-learn model and deploy it on Cloud ML Engine.
Cloud, Google, Google Cloud, Machine Learning, Python, scikit-learn
- Multi-Class Text Classification with Scikit-Learn - Aug 27, 2018.
The vast majority of text classification articles and tutorials on the internet cover binary text classification, such as email spam filtering and sentiment analysis. Real-world problems are much more complicated than that.
NLP, Python, scikit-learn, Text Classification, Text Mining
- Analyze, engineer, design: Do it all with Dash - Aug 24, 2018.
Open-source Dash lets you wrap a GUI around that analytical code, without leaving the familiarity of Python. Explore your data with rich, interactive drop-down menus, sliders, and other components, all in the web browser.
Dash, Data Visualization, Open Source, Plotly, Python
- 9 Things You Should Know About TensorFlow - Aug 22, 2018.
A summary of the key points from the Google Cloud Next talk in San Francisco, "What’s New with TensorFlow?", including neural networks, TensorFlow Lite, data pipelines and more.
Deep Learning, Google, Keras, Machine Learning, Python, TensorFlow
- Kinetica: Software Engineer (Python) [Arlington, VA] - Aug 21, 2018.
Work closely with the Product Owner to build out the product in Python and integrate all other parts (TensorFlow, Kubernetes, and our GPU-powered DB) using Python bindings to build and deliver an overall product (a REST API).
Arlington, Database, GPU, Kinetica, Python, Software Engineer, VA
- Basic Statistics in Python: Probability - Aug 21, 2018.
At the most basic level, probability seeks to answer the question, "What is the chance of an event happening?" To calculate the chance of an event happening, we also need to consider all the other events that can occur.
Normal Distribution, Probability, Python, Statistics
- Why Automated Feature Engineering Will Change the Way You Do Machine Learning - Aug 20, 2018.
Automated feature engineering will save you time, build better predictive models, create meaningful features, and prevent data leakage.
Automated Machine Learning, Feature Engineering, Machine Learning, Python
- Introduction to Fraud Detection Systems - Aug 17, 2018.
Using the Python gradient boosting library LightGBM, this article introduces fraud detection systems, with code samples included to help you get started.
Fraud Detection, Gradient Boosting, Python
- Auto-Keras, or How You can Create a Deep Learning Model in 4 Lines of Code - Aug 17, 2018.
Auto-Keras is an open source software library for automated machine learning. Auto-Keras provides functions to automatically search for architecture and hyperparameters of deep learning models.
Automated Machine Learning, Keras, Neural Networks, Python
- Machine Learning with TensorFlow - Aug 16, 2018.
In this on-demand webinar, you’ll get a general introduction to working with TensorFlow and its surrounding ecosystem, general problem classes, where you can get big acceleration, and why you should be running on a CPU.
ActiveState, Intel, Machine Learning, Python, TensorFlow
- A Crash Course in MXNet Tensor Basics & Simple Automatic Differentiation - Aug 16, 2018.
This is an overview of some basic functionality of the MXNet ndarray package for creating tensor-like objects, and using the autograd package for performing automatic differentiation.
GPU, MXNet, Python, Tensor
- Top KDnuggets tweets, Aug 1-14: Basic Statistics in Python; Essential Command Line Tools for Data Scientists - Aug 15, 2018.
Basic Statistics in Python: Descriptive Statistics; Top 12 Essential Command Line Tools for Data Scientists; WTF is a Tensor?!?; How GOAT Taught a Machine to Love Sneakers
Python, Statistics, Tensor, Top tweets
- An Introduction to t-SNE with Python Example - Aug 15, 2018.
In this post we’ll give an introduction to the exploratory and visualization t-SNE algorithm. t-SNE is a powerful dimension reduction and visualization technique used on high dimensional data.
Clustering, Data Visualization, PCA, Python, t-SNE
- KDnuggets™ News 18:n31, Aug 15: Top 10 roles in AI and data science; Github Data Science Spotlight: Python tools for Machine Learning - Aug 15, 2018.
Also: A Practitioner Guide to NLP; Reinforcement Learning: The Business Use Case; Data Scientist guide for getting started with Docker
Career, Data Science, Docker, GitHub, Jobs, Python
- Setting up your AI Dev Environment in 5 Minutes - Aug 13, 2018.
Whether you're a novice data science enthusiast setting up TensorFlow for the first time, or a seasoned AI engineer working with terabytes of data, getting your libraries, packages, and frameworks installed is always a struggle. Learn how datmo, an open source Python package, helps you get started in minutes.
AI, datmo, Development, Docker, Machine Learning, Python, TensorFlow
- Optimization 101 for Data Scientists - Aug 8, 2018.
We show how to use optimization strategies to make the best possible decision.
Football, Julia, Optimization, Python, R, Sports
- GitHub Python Data Science Spotlight: AutoML, NLP, Visualization, ML Workflows - Aug 8, 2018.
This post covers a wide spectrum of open source data science projects, all available in GitHub repositories.
Automated Machine Learning, Data Science, Data Visualization, GitHub, Keras, Machine Learning, MLflow, NLP, Python, Workflow
- KDnuggets™ News 18:n30, Aug 8: Iconic Data Visualisation; Data Scientist Interviews Demystified; Simple Statistics in Python - Aug 8, 2018.
Also: Selecting the Best Machine Learning Algorithm for Your Regression Problem; From Data to Viz: how to select the right chart for your data; Only Numpy: Implementing GANs and Adam Optimizer using Numpy; Programming Best Practices for Data Science
Data Science, Data Visualization, Generative Adversarial Network, Interview, Machine Learning, numpy, Python, Regression, Statistics
- Programming Best Practices For Data Science - Aug 7, 2018.
In this post, I'll go over the two mindsets most people switch between when doing programming work specifically for data science: the prototype mindset and the production mindset.
Best Practices, Data Science, Pandas, Programming, Python
- Only Numpy: Implementing GANs and Adam Optimizer using Numpy - Aug 6, 2018.
This post is an implementation of GANs and the Adam optimizer using only Python and Numpy, with minimal focus on the underlying maths involved.
GANs, Generative Adversarial Network, Neural Networks, numpy, Optimization, Python
- WTF is TF-IDF? - Aug 2, 2018.
Relevant words are not necessarily the most frequent words since stopwords like “the”, “of” or “a” tend to occur very often in many documents.
Information Retrieval, Python, Text Analytics, Text Mining, TF-IDF
- Basic Statistics in Python: Descriptive Statistics - Aug 1, 2018.
This article covers defining statistics, descriptive statistics, measures of central tendency, and measures of spread. This article assumes no prior knowledge of statistics, but does require at least a general knowledge of Python.
Descriptive Analytics, Python, Statistics
- Intuitive Ensemble Learning Guide with Gradient Boosting - Jul 30, 2018.
This tutorial discusses the importance of ensemble learning, with gradient boosting as a case study.
Ensemble Methods, Gradient Boosting, Python
- Remote Data Science: How to Send R and Python Execution to SQL Server from Jupyter Notebooks - Jul 27, 2018.
Did you know that you can execute R and Python code remotely in SQL Server from Jupyter Notebooks or any IDE? Machine Learning Services in SQL Server eliminates the need to move data around.
Jupyter, Machine Learning, Microsoft, Python, R, SQL, SQL Server
- KDnuggets™ News 18:n28, Jul 25: Best (and Free) Resources to Understand Deep Learning; Why Germany did not beat Brazil in the final – Data Science lessons from the World Cup - Jul 25, 2018.
Also 5 Quick and Easy Data Visualizations in Python with Code; Happy 25th Birthday, KDnuggets!
About KDnuggets, Data Visualization, Deep Learning, Python, World Cup
- Genetic Algorithm Implementation in Python - Jul 24, 2018.
This tutorial will implement the genetic algorithm optimization technique in Python based on a simple example in which we are trying to maximize the output of an equation.
Algorithms, Genetic Algorithm, Python
- Cookiecutter Data Science: How to Organize Your Data Science Project - Jul 24, 2018.
A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.
Data Science, Programming, Project, Python
- Improve Data Science Productivity with Anaconda Enterprise - Jul 23, 2018.
Anaconda Enterprise is the only product on the market that empowers your data science team to go from laptop to cluster to production with full reproducibility and governance.
Anaconda, Data Science, Enterprise, Python
- Comparison of Top 6 Python NLP Libraries - Jul 23, 2018.
Today, we want to outline and compare the most popular and helpful natural language processing libraries, based on our experience.
NLP, Python
- Receiver Operating Characteristic Curves Demystified (in Python) - Jul 20, 2018.
In this blog, I will reveal, step by step, how to plot an ROC curve using Python. After that, I will explain the characteristics of a basic ROC curve.
Machine Learning, Metrics, Python, ROC-AUC
- Explaining the 68-95-99.7 rule for a Normal Distribution - Jul 19, 2018.
This post explains how those numbers were derived in the hope that they can be more interpretable for your future endeavors.
Data Analysis, Data Science, Normal Distribution, Python, Statistics
- 5 Quick and Easy Data Visualizations in Python with Code - Jul 18, 2018.
This post provides an overview of a small number of widely used data visualizations, and includes code in the form of functions to implement each in Python using Matplotlib.
Data Visualization, Matplotlib, Python
- Basic Image Processing in Python, Part 2 - Jul 17, 2018.
We explain how to easily access and manipulate the internal components of digital images using Python and give examples from satellite image processing.
Computer Vision, Image Processing, numpy, Python
- KDnuggets™ News 18:n26, Jul 11: 5 Favorite Free Visualization Tools; SQL Cheat Sheet; Top 20 Python Libraries for Data Science - Jul 11, 2018.
Also Introduction to Apache Spark; fast.ai Machine Learning Course Notes; Cartoon: How is Data Science Different From Religion?
Cheat Sheet, Data Visualization, Python, SQL
- Basic Image Data Analysis Using Numpy and OpenCV – Part 1 - Jul 10, 2018.
Accessing the internal components of digital images with Python packages makes it easier to understand their properties and nature.
Computer Vision, Image Processing, numpy, OpenCV, Python
- Analyze a Soccer (Football) Game Using Tensorflow Object Detection and OpenCV - Jul 10, 2018.
For the data scientist within you, let's use this opportunity to do some analysis on soccer clips. With deep learning and OpenCV, we can extract interesting insights from video clips.
Football, Image Recognition, Object Detection, OpenCV, Python, Soccer, TensorFlow, Video recognition, World Cup
- Manage your Machine Learning Lifecycle with MLflow – Part 1 - Jul 5, 2018.
Reproducibility, good management, and experiment tracking are necessary to make it easy to test others’ work and analysis. In this first part, we will learn through simple examples how to record and query experiments and how to package machine learning models with MLflow so they can be reproduced and run on any platform.
Databricks, Life Cycle, MLflow, Pipeline, Python, Workflow
- Text Classification & Embeddings Visualization Using LSTMs, CNNs, and Pre-trained Word Vectors - Jul 5, 2018.
In this tutorial, I classify Yelp round-10 review datasets. After processing the review comments, I trained three models in three different ways and obtained three word embeddings.
Convolutional Neural Networks, Keras, LSTM, NLP, Python, Text Classification, Word Embeddings
- Deep Quantile Regression - Jul 3, 2018.
Most Deep Learning frameworks currently focus on giving a best estimate as defined by a loss function. Occasionally something beyond a point estimate is required to make a decision. This is where a distribution would be useful. This article will purely focus on inferring quantiles.
Deep Learning, Hyperparameter, Keras, Neural Networks, Python, Regression
- Inside the Mind of a Neural Network with Interactive Code in Tensorflow - Jun 29, 2018.
Understand the inner workings of neural network models as this post covers three related topics: histogram of weights, visualizing the activation of neurons, and interior / integral gradients.
Convolutional Neural Networks, Image Recognition, Neural Networks, Python, TensorFlow
- Building a Basic Keras Neural Network Sequential Model - Jun 29, 2018.
The approach basically coincides with Chollet's Keras 4 step workflow, which he outlines in his book "Deep Learning with Python," using the MNIST dataset, and the model built is a Sequential network of Dense layers. A building block for additional posts.
Keras, MNIST, Neural Networks, Python
- Top 20 Python Libraries for Data Science in 2018 - Jun 27, 2018.
Our selection actually contains more than 20 libraries, as some of them are alternatives to each other and solve the same problem. Therefore we have grouped them as it's difficult to distinguish one particular leader at the moment.
Bokeh, Data Science, Keras, Matplotlib, NLTK, numpy, Pandas, Plotly, Python, PyTorch, scikit-learn, SciPy, Seaborn, TensorFlow, XGBoost
- How to Execute R and Python in SQL Server with Machine Learning Services - Jun 25, 2018.
Machine Learning Services in SQL Server eliminates the need for data movement - you can install and run R/Python packages to build Deep Learning and AI applications on data in SQL Server.
Azure ML, Machine Learning, Microsoft, Python, R, SQL, SQL Server
- Get Packt Skill Up Developer Skills Report - Jun 19, 2018.
Find the top tools for 4 distinct industries, learn what developers in different sectors say is the next big thing, and more. Also get any Packt book or video for just $10.
Developers, Free ebook, Machine Learning, Packt Publishing, Python
- Choosing the Right Metric for Evaluating Machine Learning Models — Part 2 - Jun 19, 2018.
This part focuses on commonly used metrics in classification and explains, with context, why we should prefer some over others.
Classification, Machine Learning, Metrics, Python, ROC-AUC
- Step Forward Feature Selection: A Practical Example in Python - Jun 18, 2018.
When it comes to disciplined approaches to feature selection, wrapper methods are those which marry the feature selection process to the type of model being built, evaluating feature subsets to compare the model performance they yield, and subsequently selecting the best-performing subset.
Feature Selection, Machine Learning, Python
- Generating Text with RNNs in 4 Lines of Code - Jun 14, 2018.
Want to generate text with little trouble, and without building and tuning a neural network yourself? Let's check out a project which allows you to "easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code."
Donald Trump, LSTM, NLP, Python, Recurrent Neural Networks, Twitter
- KDnuggets™ News 18:n23, Jun 13: Did Python declare victory over R?; Master the Netflix Interview; Deep Learning Projects DIY Style - Jun 13, 2018.
Also: Command Line Tricks For Data Scientists; How (dis)similar are my train and test data?; 5 Machine Learning Projects You Should Not Overlook, June 2018; Introduction to Game Theory; Human Interpretable Machine Learning
Data Science, Deep Learning, Interview Questions, Machine Learning, Netflix, Python, R, Training Data
- Empowering National Grid with Anaconda Enterprise - Jun 11, 2018.
With Anaconda Enterprise, National Grid was able to implement a more informed and cost-effective system that allowed for greater accuracy in modeling and predicting maintenance needs. Read the case study to learn more.
Anaconda, Electricity, Enterprise, National Grid, Python
- Packaging and Distributing Your Python Project to PyPI for Installation Using pip - Jun 11, 2018.
This tutorial will explain the steps required to package your Python projects, distribute them in distribution formats using setuptools, upload them to the Python Package Index (PyPI) repository using twine, and finally install them using Python installers such as pip and conda.
Distribution, Project, Python
- DIY Deep Learning Projects - Jun 8, 2018.
Inspired by the great work of Akshay Bahadur, this article shows some projects applying Computer Vision and Deep Learning, with implementations and details so you can reproduce them on your computer.
Computer Vision, Data Science, Deep Learning, LinkedIn, Neural Networks, OpenCV, Python
- The 6 components of Open-Source Data Science/ Machine Learning Ecosystem; Did Python declare victory over R? - Jun 6, 2018.
We find 6 tools form the modern open source Data Science / Machine Learning ecosystem; examine whether Python declared victory over R; and review which tools are most associated with Deep Learning and Big Data.
Anaconda, Apache Spark, Data Science, Keras, Machine Learning, Open Source, Poll, Python, R, RapidMiner, Scala, scikit-learn, TensorFlow
- ioModel Machine Learning Research Platform – Open Source - Jun 5, 2018.
This article introduces ioModel, an open source research platform that ingests data and automatically generates descriptive statistics on that data.
Data Preparation, GitHub, Machine Learning, Open Source, Postgres, Python
- The Keras 4 Step Workflow - Jun 4, 2018.
In his book "Deep Learning with Python," Francois Chollet outlines a process for developing neural networks with Keras in 4 steps. Let's take a look at this process with a simple example.
Francois Chollet, Keras, Neural Networks, Python, Workflow
- Overview of Dash Python Framework from Plotly for building dashboards - May 31, 2018.
An introduction to the Dash framework from Plotly, a reactive framework for building dashboards in Python. The tech talk covers the basics and more advanced topics such as custom components and scaling.
Dashboard, Data Analytics, Data Visualization, Plotly, Python
- NLP in Online Courses: an Overview - May 28, 2018.
This article examines several Natural Language Processing (NLP) courses across a variety of online sources and programming languages.
Coursera, edX, NLP, NLTK, Online Education, Python, Sciforce, Udemy
- Learn AI and Data Science rapidly based only on high school math – KDnuggets Offer - May 25, 2018.
This 3-month program, created by Ajit Jaokar, who teaches at Oxford, is interactive and delivered by video. Coding examples are in Python. Places limited - check special KDnuggets rate.
AI, Ajit Jaokar, Data Science Education, Mathematics, Online Education, Python
- KDnuggets™ News 18:n21, May 23: Python eats away at R; Top 2018 Analytics, Data Science, Machine Learning tools; 9 Must-have skills for a Data Scientist - May 23, 2018.
Also: How to Implement a YOLO (v3) Object Detector from Scratch in PyTorch; Frameworks for Approaching the Machine Learning Process.
Data Science Platform, Data Science Skills, Poll, Python, PyTorch
- Python eats away at R: Top Software for Analytics, Data Science, Machine Learning in 2018: Trends and Analysis - May 22, 2018.
Python continues to eat away at R, RapidMiner gains, SQL is steady, TensorFlow advances and pulls Keras along, Hadoop drops, Data Science platforms consolidate, and more.
Anaconda, Data Mining Software, Data Science Platform, Hadoop, Keras, Poll, Python, R, RapidMiner, SQL, TensorFlow, Trends
- Top Stories, May 14-20: Data Science vs Machine Learning vs Data Analytics vs Business Analytics; Implement a YOLO Object Detector from Scratch in PyTorch - May 21, 2018.
Also: An Introduction to Deep Learning for Tabular Data; 9 Must-have skills you need to become a Data Scientist, updated; GANs in TensorFlow from the Command Line: Creating Your First GitHub Project; Complete Guide to Build ConvNet HTTP-Based Application
Convolutional Neural Networks, Deep Learning, Image Recognition, Neural Networks, Python, PyTorch, TensorFlow, Top stories
- Kernel Machine Learning (KernelML) - Generalized Machine Learning Algorithm - May 18, 2018.
This article introduces a pip Python package called KernelML, created to give analysts and data scientists a generalized machine learning algorithm for complex loss functions and non-linear coefficients.
Clustering, Machine Learning, Python
- How to Implement a YOLO (v3) Object Detector from Scratch in PyTorch: Part 1 - May 17, 2018.
The best way to go about learning object detection is to implement the algorithms by yourself, from scratch. This is exactly what we'll do in this tutorial.
Computer Vision, Image Recognition, Neural Networks, Object Detection, Python, PyTorch, YOLO
- GANs in TensorFlow from the Command Line: Creating Your First GitHub Project - May 16, 2018.
In this article I will present the steps to create your first GitHub Project. I will use as an example Generative Adversarial Networks.
GANs, Generative Adversarial Network, GitHub, Neural Networks, Python, Rubens Zimbres, TensorFlow
- KDnuggets™ News 18:n20, May 16: PyTorch Tensor Basics; Data Science in Finance; Executive Guide to Data Science - May 16, 2018.
PyTorch Tensor Basics; Top 7 Data Science Use Cases in Finance; The Executive Guide to Data Science and Machine Learning; Data Augmentation: How to use Deep Learning when you have Limited Data
Computer Vision, Data Science, Deep Learning, Finance, Neural Networks, Python, PyTorch, Tensor, Wikidata
- Complete Guide to Build ConvNet HTTP-Based Application using TensorFlow and Flask RESTful Python API - May 15, 2018.
In this tutorial, a CNN is built, trained, and tested against the CIFAR10 dataset. To make the model remotely accessible, a Flask Web application is created using Python to receive an uploaded image and return its classification label using HTTP.
API, Convolutional Neural Networks, Dropout, Flask, Neural Networks, Python, RESTful API, TensorFlow
- Simple Derivatives with PyTorch - May 14, 2018.
PyTorch includes an automatic differentiation package, autograd, which does the heavy lifting for finding derivatives. This post explores simple derivatives using autograd, outside of neural networks.
Python, PyTorch
- PyTorch Tensor Basics - May 11, 2018.
This is an introduction to PyTorch's Tensor class, which is reasonably analogous to Numpy's ndarray, and which forms the basis for building neural networks in PyTorch.
GPU, Python, PyTorch, Tensor
- Unleash a faster Python on Your Data. - May 10, 2018.
Get real performance results and download the free Intel® Distribution for Python that includes everything you need for blazing-fast computing, analytics, machine learning, and more.
free download, Intel, numpy, Python