- Intent Recognition with BERT using Keras and TensorFlow 2 - Feb 10, 2020.
TL;DR Learn how to fine-tune the BERT model for text classification. Train and evaluate it on a small dataset for detecting seven intents. The results might surprise you!
BERT, Keras, NLP, Python, TensorFlow
- Getting up and Running with Python: Installing Anaconda on Windows - Feb 6, 2020.
This tutorial covers how to download and install Anaconda on Windows; how to test your installation; how to fix common installation issues; and what to do after installing Anaconda.
Anaconda, Python
- Create Your Own Computer Vision Sandbox - Feb 5, 2020.
This post covers a wide array of computer vision tasks, from automated data collection to CNN model building.
Computer Vision, Convolutional Neural Networks, Python
- Audio File Processing: ECG Audio Using Python - Feb 4, 2020.
In this post, we look at an application of audio file processing for a good cause: analyzing ECG heartbeat recordings, with code written in Python.
Audio, Data Processing, Health, Python
- How to Optimize Your Jupyter Notebook - Jan 30, 2020.
This article walks through some simple tricks on improving your Jupyter Notebook experience, and covers useful shortcuts, adding themes, automatically generated table of contents, and more.
Jupyter, Optimization, Python
- Generating English Pronoun Questions Using Neural Coreference Resolution - Jan 29, 2020.
This post will introduce a practical method for generating English pronoun questions from any story or article. Learn how to take an additional step toward computationally understanding language.
NLP, Python, spaCy, Text Analytics
- Exoplanet Hunting Using Machine Learning - Jan 28, 2020.
Search for exoplanets — those planets beyond our own solar system — using machine learning, and implement these searches in Python.
Cosmology, Machine Learning, Python
- The 5 Most Useful Techniques to Handle Imbalanced Datasets - Jan 22, 2020.
This post explains the various techniques you can use to handle imbalanced datasets.
Balancing Classes, Datasets, Metrics, Python, Sampling, Unbalanced
- Random Forest® — A Powerful Ensemble Learning Algorithm - Jan 22, 2020.
The article explains the Random Forest algorithm and how to build and optimize a Random Forest classifier.
Algorithms, Ensemble Methods, Python, random forests algorithm
- Geovisualization with Open Data - Jan 15, 2020.
In this post I want to show how to use publicly available (open) data to create geo visualizations in Python. Maps are a great way to communicate and compare information when working with geolocation data. There are many frameworks to plot maps; here I focus on matplotlib and geopandas (and give a glimpse of mplleaflet).
Germany, Maps, Open Data, Python, Visualization
- KDnuggets™ News 20:n02, Jan 15: Top 5 Must-have Data Science Skills; Learn Machine Learning with THIS Book - Jan 15, 2020.
This week: learn the 5 must-have data science skills for the new year; find out which book is THE book to get started learning machine learning; pick up some Python tips and tricks; learn SQL, but learn it the hard way; and find an introductory guide to learning common NLP techniques.
Books, Data Science, Data Science Skills, Machine Learning, NLP, Programming, Python, SQL, Tips
- KDnuggets™ News 20:n01, Jan 8: How to “Ultralearn” Data Science; How teams do AutoML? - Jan 8, 2020.
The first issue of 2020 brings you a summary of how to "Ultralearn" Data Science, for those in a hurry; an explanation of how teams work on AutoML projects; why Python is a preferred language for Data Science; and a cartoon on teaching ethics to AI.
AutoML, Data Science Team, Python, Ultralearn
- 10 Python Tips and Tricks You Should Learn Today - Jan 8, 2020.
Check out this collection of 10 Python snippets that can be taken as a reference for your daily work.
Programming, Python, Tips
- H2O Framework for Machine Learning - Jan 6, 2020.
This article is an overview of H2O, a scalable and fast open-source platform for machine learning. We will apply it to perform classification tasks.
Automated Machine Learning, AutoML, H2O, Machine Learning, Python
- How to Convert a Picture to Numbers - Jan 6, 2020.
Reducing images to numbers makes them amenable to computation. Let's take a look at the why and the how using Python and NumPy.
Computer Vision, Image Processing, numpy, Python
- Why Python is One of the Most Preferred Languages for Data Science? - Jan 3, 2020.
Why do most data scientists love Python? Learn more about how so many well-developed Python packages can help you accomplish your crucial data science tasks.
Data Exploration, Data Science, Programming Languages, Python
- Predict Electricity Consumption Using Time Series Analysis - Jan 2, 2020.
Time series forecasting is a technique for predicting events through a sequence of time. In this post, we take a small forecasting problem and work through it from start to finish, learning time series forecasting along the way.
ARIMA, Electricity, Python, Time Series
- Top KDnuggets tweets, Dec 18-30: A Gentle Introduction to Math Behind Neural Networks - Dec 31, 2019.
A Gentle Introduction to #Math Behind #NeuralNetworks; Learn How to Quickly Create UIs in Python; I wanna be a data scientist, but... how!?; I created my own deepfake in two weeks
Career, Deepfakes, Mathematics, Neural Networks, Python, Top tweets
- Fighting Overfitting in Deep Learning - Dec 27, 2019.
This post outlines an attack plan for fighting overfitting in neural networks.
Deep Learning, Keras, Neural Networks, Overfitting, Python, Regularization, Transfer Learning
- Market Basket Analysis: A Tutorial - Dec 24, 2019.
This article is about Market Basket Analysis & the Apriori algorithm that works behind it.
Apriori, Association Rules, Data Mining, Python
- KDnuggets™ News 19:n48, Dec 18: Build Pipelines with Pandas Using pdpipe; AI, Analytics, ML, DS, Technology Main Developments, Key Trends; Poll on AutoML - Dec 18, 2019.
Build Pipelines with Pandas Using pdpipe; AI, Analytics, ML, DS, Technology Main Developments, Key Trends; New Poll: Does AutoML work? Ultralearn Data Science; Python Dictionary How-To; Top stories of 2019 and more.
2020 Predictions, Data Science Education, Pandas, Python
- Pedestrian Detection Using Non Maximum Suppression Algorithm - Dec 17, 2019.
Read this overview of a complete pipeline for detecting pedestrians on the road.
Computer Vision, Image Recognition, Object Detection, Python
- Let’s Build an Intelligent Chatbot - Dec 17, 2019.
Check out this step by step approach to building an intelligent chatbot in Python.
Chatbot, NLP, NLTK, Python
- Build Pipelines with Pandas Using pdpipe - Dec 13, 2019.
We show how to build intuitive and useful pipelines with Pandas DataFrame using a wonderful little library called pdpipe.
Data Preparation, Data Preprocessing, Pandas, Pipeline, Python
- Plotnine: Python Alternative to ggplot2 - Dec 12, 2019.
Python's plotting libraries, such as matplotlib and seaborn, do allow the user to create elegant graphics, but the lack of a standardized syntax for the grammar of graphics, compared with the simple, readable, layered approach of ggplot2 in R, makes it more difficult to implement in Python.
Data Science, Data Visualization, Python, R
- Python Dictionary Guide: 10 Python Dictionary Methods & Examples - Dec 12, 2019.
Master Python Dictionaries and their essential functions in 15 minutes with this introductory guide.
Programming, Python
- Top KDnuggets tweets, Dec 04-10: AI, Analytics, Machine Learning, Data Science, Deep Learning Research Main Developments in 2019 and Key Trends for 2020 - Dec 11, 2019.
AI, Analytics, Machine Learning, Data Science, Deep Learning Research Main Developments and Key Trends; Down with technical debt! Clean #Python for #DataScientists; Calculate Similarity - the most relevant Metrics in a Nutshell.
2020 Predictions, Metrics, Python, Similarity, Top tweets
- Interpretability: Cracking open the black box, Part 2 - Dec 11, 2019.
The second part in a series on leveraging techniques to take a look inside the black box of AI, this guide considers post-hoc interpretation that is useful when the model is not transparent.
Explainability, Explainable AI, Feature Selection, Interpretability, Python
- 5 Great New Features in Latest Scikit-learn Release - Dec 10, 2019.
From not sweating missing values, to determining feature importance for any estimator, to support for stacking, and a new plotting API, here are 5 new features of the latest release of Scikit-learn which deserve your attention.
Data Preparation, Data Preprocessing, Ensemble Methods, Feature Selection, Gradient Boosting, K-nearest neighbors, Machine Learning, Missing Values, Python, scikit-learn, Visualization
- 10 Free Top Notch Machine Learning Courses - Dec 6, 2019.
Are you interested in studying machine learning over the holidays? This collection of 10 free top notch courses will allow you to do just that, with something for every approach to improving your machine learning skills.
Books, Computer Vision, Courses, Deep Learning, Explainability, Graph Analytics, Interpretability, Machine Learning, NLP, Python
- Lit BERT: NLP Transfer Learning In 3 Steps - Nov 29, 2019.
PyTorch Lightning is a lightweight framework which allows anyone using PyTorch to scale deep learning code easily while making it reproducible. In this tutorial we’ll use Huggingface's implementation of BERT to do a finetuning task in Lightning.
BERT, NLP, Python, PyTorch Lightning, Transfer Learning
- Open Source Projects by Google, Uber and Facebook for Data Science and AI - Nov 28, 2019.
Open source is becoming the standard for sharing and improving technology. Some of the largest organizations in the world, namely Google, Facebook and Uber, are open-sourcing the technologies they use in their own workflows.
Advice, AI, Data Science, Data Scientist, Data Visualization, Deep Learning, Facebook, Google, Open Source, Python, Uber
- KDnuggets™ News 19:n45, Nov 27: Interpretable vs black box models; Advice for New and Junior Data Scientists - Nov 27, 2019.
This week: Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead; Advice for New and Junior Data Scientists; Python Tuples and Tuple Methods; Can Neural Networks Develop Attention? Google Thinks they Can; Three Methods of Data Pre-Processing for Text Classification
Advice, Attention, Data Scientist, Machine Learning, Modeling, Neural Networks, NLP, Programming, Python, Text Classification
- Content-based Recommender Using Natural Language Processing (NLP) - Nov 26, 2019.
A guide to building a content-based movie recommender model based on NLP.
Movies, Netflix, NLP, Python, Recommender Systems
- Automated Machine Learning Project Implementation Complexities - Nov 22, 2019.
To demonstrate the implementation complexity differences along the AutoML highway, let's have a look at how 3 specific software projects approach the implementation of just such an AutoML "solution," namely Keras Tuner, AutoKeras, and automl-gs.
Automated Machine Learning, Keras, Pipeline, Python
- Python, Selenium & Google for Geocoding Automation: Free and Paid - Nov 21, 2019.
This tutorial will take you through two options for automating the geocoding process using Python, Selenium and the Google Geocoding API.
Automation, Geocode, Geoscience, Geospatial, Google, Python, Selenium, Web Scraping
- The Notebook Anti-Pattern - Nov 21, 2019.
This article aims to explain why the drive towards using notebooks in production is an anti-pattern, giving some suggestions along the way.
Jupyter, Python
- Python Tuples and Tuple Methods - Nov 21, 2019.
Brush up on your Python basics with this post on creating, using, and manipulating tuples.
Programming, Python
- Data Science for Managers: Programming Languages - Nov 19, 2019.
In this article, we are going to talk about popular languages for Data Science and briefly describe each of them.
Data Science, Manager, MATLAB, Octave, Programming Languages, Python, R, Scala
- GitHub Repo Raider and the Automation of Machine Learning - Nov 18, 2019.
Since X never, ever marks the spot, this article raids the GitHub repos in search of quality automated machine learning resources. Read on for projects and papers to help understand and implement AutoML.
Automated Machine Learning, GitHub, Machine Learning, Movies, Python
- Python Lists and List Manipulation - Nov 15, 2019.
In Python, lists store an ordered collection of items which can be of different types. This post is an overview of lists and their manipulation.
Programming, Python
- How to Visualize Data in Python (and R) - Nov 14, 2019.
Producing accessible data visualizations is a key data science skill. The following guidelines will help you create the best representations of your data using R and Python's Pandas library.
Data Visualization, Matplotlib, Python, R, SuperDataScience
- Testing Your Machine Learning Pipelines - Nov 14, 2019.
Let’s take a look at traditional testing methodologies and how we can apply these to our data/ML pipelines.
Machine Learning, Pipeline, Python
- Python Workout / Practices of a Python Pro / Classic Computer Science Problems in Python - Nov 13, 2019.
Whether you’re a beginner or an expert, there are always new ways you can improve your Python coding. Save 40% on this trio of Manning Python books today! Just enter the code nlpropython40 at checkout when you buy from manning.com.
Book, Manning, Python
- Beginners Guide to the Three Types of Machine Learning - Nov 13, 2019.
The following article is an introduction to classification and regression (together known as supervised learning) and to unsupervised learning (which in the context of machine learning applications often refers to clustering), and includes a walkthrough in the popular Python library scikit-learn.
Beginners, Classification, Machine Learning, Python, Regression, scikit-learn, Supervised Learning, Unsupervised Learning
- KDnuggets™ News 19:n43, Nov 13: Dynamic Reports in Python and R; Creating NLP Vocabularies; What is Data Science? - Nov 13, 2019.
On KDnuggets this week: Orchestrating Dynamic Reports in Python and R with Rmd Files; How to Create a Vocabulary for NLP Tasks in Python; What is Data Science?; The Complete Data Science LinkedIn Profile Guide; Set Operations Applied to Pandas DataFrames; and much, much more.
Data Science, Deep Learning, Facebook, LinkedIn, NLP, Pandas, Python, R, Reinforcement Learning, Report

- How to Speed up Pandas by 4x with one line of code - Nov 12, 2019.
While Pandas is the library for data processing in Python, it isn't really built for speed. Learn more about the new library, Modin, developed to distribute Pandas' computation to speed up your data prep.
Data Preparation, Data Preprocessing, Modin, Pandas, Python
- Understanding Boxplots - Nov 8, 2019.
A boxplot can tell you about your outliers and what their values are. It can also tell you if your data is symmetrical, how tightly your data is grouped, and if and how your data is skewed.
Data Visualization, Matplotlib, Pandas, Python, Seaborn
- Orchestrating Dynamic Reports in Python and R with Rmd Files - Nov 8, 2019.
Do you want to extract CSV files with Python and visualize them in R? How does preparing everything in R and drawing conclusions with Python sound? Both are possible if you know the right libraries and techniques. Here, we’ll walk through a use case using both languages in one analysis.
Python, R, Report
- Data Cleaning and Preprocessing for Beginners - Nov 7, 2019.
Careful preprocessing of data for your machine learning project is crucial. This overview describes the process of data cleaning and dealing with noise and missing data.
Beginners, Data Cleaning, Data Preprocessing, Pandas, Python, Sciforce
- Set Operations Applied to Pandas DataFrames - Nov 7, 2019.
In this tutorial, we show how to apply mathematical set operations (union, intersection, and difference) to Pandas DataFrames with the goal of easing the task of comparing the rows of two datasets.
Data Preparation, Data Science, Pandas, Python
- How to Create a Vocabulary for NLP Tasks in Python - Nov 7, 2019.
This post will walk through a Python implementation of a vocabulary class for storing processed text data and related metadata in a manner useful for subsequently performing NLP tasks.
Data Preparation, Data Preprocessing, NLP, Python
- Customer Segmentation Using K Means Clustering - Nov 4, 2019.
Customer Segmentation can be a powerful means to identify unsatisfied customer needs. This technique can be used by companies to outperform the competition by developing uniquely appealing products and services.
Clustering, Customer Analytics, K-means, Python, Segmentation
- Build an Artificial Neural Network From Scratch: Part 1 - Nov 1, 2019.
This article focuses on building an Artificial Neural Network using the NumPy Python library.
Neural Networks, numpy, Python
- How to Build Your Own Logistic Regression Model in Python - Oct 31, 2019.
A hands-on guide to Logistic Regression for aspiring data scientists and machine learning engineers.
Logistic Regression, Machine Learning, Python
- KDnuggets™ News 19:n41, Oct 30: Feature Selection: Beyond feature importance?; Time Series Analysis Using KNIME and Spark - Oct 30, 2019.
This week in KDnuggets: Feature Selection: Beyond feature importance?; Time Series Analysis: A Simple Example with KNIME and Spark; 5 Advanced Features of Pandas and How to Use Them; How to Measure Foot Traffic Using Data Analytics; Introduction to Natural Language Processing (NLP); and much, much more!
Apache Spark, Data Analytics, Feature Selection, Knime, NLP, Pandas, Python, scikit-learn, Time Series
- How to Extend Scikit-learn and Bring Sanity to Your Machine Learning Workflow - Oct 29, 2019.
In this post, learn how to extend Scikit-learn code to make your experiments easier to maintain and reproduce.
Machine Learning, Python, scikit-learn, Software Engineering, Workflow
- 5 Advanced Features of Pandas and How to Use Them - Oct 25, 2019.
The pandas library offers core functionality for preparing your data using Python. But many don't go beyond the basics, so learn about these lesser-known advanced methods that will make handling your data easier and cleaner.
Data Preparation, Pandas, Python
- Convolutional Neural Network for Breast Cancer Classification - Oct 24, 2019.
See how Deep Learning can help in solving one of the most commonly diagnosed cancers in women.
Cancer Detection, Deep Learning, Healthcare, Python
- How to Write Web Apps Using Simple Python for Data Scientists - Oct 22, 2019.
Convert your Data Science Projects into cool apps easily without knowing any web frameworks.
Apps, Data Science, Data Scientist, Python
- Writing Your First Neural Net in Less Than 30 Lines of Code with Keras - Oct 18, 2019.
Read this quick overview of neural networks and learn how to implement your first in very few lines using Keras.
Keras, Neural Networks, Python
- How to Easily Deploy Machine Learning Models Using Flask - Oct 17, 2019.
This post aims to get you started with putting your trained machine learning models into production using a Flask API.
Deployment, Flask, Machine Learning, Python
- The 5 Classification Evaluation Metrics Every Data Scientist Must Know - Oct 16, 2019.
This post is about various evaluation metrics and how and when to use them.
Data Scientist, Machine Learning, Metrics, Python
- Activation maps for deep learning models in a few lines of code - Oct 10, 2019.
We illustrate how to show the activation maps of various layers in a deep CNN model with just a couple of lines of code.
Architecture, Deep Learning, Neural Networks, Python
- Top KDnuggets tweets, Oct 02-08: Turn #Python Scripts into Beautiful ML Tools – with Streamlit, an app framework built for #MachineLearning engineers - Oct 9, 2019.
Also: 12 things I wish I'd known before starting as a Data Scientist; 10 Free Top Notch Natural Language Processing Courses; The Last SQL Guide for Data Analysis; The 4 Quadrants of #DataScience Skills and 7 Principles for Creating a Viral DataViz.
Machine Learning Engineer, Python, Streamlit, Top tweets
- Contributing to PyTorch: By someone who doesn’t know a ton about PyTorch - Oct 9, 2019.
By the end of my week with the team, I managed to proudly cut two PRs on GitHub. I decided that I would write a blog post to share knowledge, not just to show that YES, you can too.
Open Source, Python, PyTorch
- The 4 Quadrants of Data Science Skills and 7 Principles for Creating a Viral Data Visualization - Oct 7, 2019.
As a data scientist, your most important skill is creating meaningful visualizations to disseminate knowledge and impact your organization or client. These seven principles will guide you toward developing charts with clarity, as exemplified with data from a recent KDnuggets poll.
Data Science, Data Science Skills, Data Visualization, Excel, Java, Python, Skills, TensorFlow
- KDnuggets™ News 19:n37, Oct 2: The Future of Analytics & Data Science! Starting NLP with spaCy & Python - Oct 2, 2019.
This week, find out what the future of analytics and data science holds; get an introduction to spaCy for natural language processing; find out how to use time series analysis for baseball; get to know your data; read 6 bits of advice for data scientists; and much, much more!
Analytics, Baseball, Data Science, Emotion, NLP, Python, Sentiment Analysis, spaCy, Time Series
- What is Hierarchical Clustering? - Sep 27, 2019.
The article contains a brief introduction to various concepts related to the hierarchical clustering algorithm.
Clustering, Machine Learning, Python
- Natural Language in Python using spaCy: An Introduction - Sep 26, 2019.
This article provides a brief introduction to working with natural language (sometimes called “text analytics”) in Python using spaCy and related libraries.
NLP, Paco Nathan, Python, spaCy
- KDnuggets™ News 19:n36, Sep 25: The Hidden Risk of AI and Big Data; The 5 Sampling Algorithms every Data Scientist needs to know - Sep 25, 2019.
Learn about the unexpected risk of AI applied to Big Data; study 5 Sampling Algorithms every Data Scientist needs to know; read how one data scientist copes with his boring days of deploying machine learning; 5 beginner-friendly steps to learn ML with Python; and more.
AI, Algorithms, Beginners, Python, Risks, Sampling
- A Single Function to Streamline Image Classification with Keras - Sep 23, 2019.
We show, step-by-step, how to construct a single, generalized, utility function to pull images automatically from a directory and train a convolutional neural net model.
Image Classification, Image Recognition, Keras, Python
- A Gentle Introduction to PyTorch 1.2 - Sep 20, 2019.
This comprehensive tutorial aims to introduce the fundamentals of PyTorch building blocks for training neural networks.
Neural Networks, Python, PyTorch
- Applying Data Science to Cybersecurity Network Attacks & Events - Sep 19, 2019.
Check out this detailed tutorial on applying data science to the cybersecurity domain, written by an individual with backgrounds in both fields.
Cybersecurity, Data Science, Machine Learning, Python, Security
- 5 Beginner Friendly Steps to Learn Machine Learning and Data Science with Python - Sep 19, 2019.
“I want to learn machine learning and artificial intelligence, where do I start?” Here.
Beginners, Data Science, Machine Learning, Python
- Python 2 End of Life Survey – Are You Prepared? - Sep 18, 2019.
Support for Python 2 will expire on Jan. 1, 2020, after which the Python core language and many third-party packages will no longer be supported or maintained. Take this survey to help determine and share your level of preparation.
ActiveState, Python, Survey
- Which Data Science Skills are core and which are hot/emerging ones? - Sep 17, 2019.
We identify two main groups of Data Science skills: A: 13 core, stable skills that most respondents have and B: a group of hot, emerging skills that most do not have (yet) but want to add. See our detailed analysis.
Career, Data Science Skills, Data Visualization, Deep Learning, Excel, Machine Learning, Poll, Python, PyTorch, Scala, Skills, Statistics, TensorFlow
- Explore the world of Bioinformatics with Machine Learning - Sep 17, 2019.
The article contains a brief introduction to Bioinformatics and shows how a machine learning classification algorithm can be used to classify the type of cancer in each patient by their gene expressions.
Bioinformatics, Machine Learning, Python
- 5 Step Guide to Scalable Deep Learning Pipelines with d6tflow - Sep 16, 2019.
How to turn a typical PyTorch script into a scalable d6tflow DAG for faster research & development.
Deep Learning, Pipeline, Python, PyTorch, Workflow
- Ensemble Methods for Machine Learning: AdaBoost - Sep 12, 2019.
It turns out that if we ask a weak algorithm to create a whole bunch of classifiers (all weak by definition) and then combine them all, what may come out is a stronger classifier.
Adaboost, Ensemble Methods, Machine Learning, Python
- Train sklearn 100x Faster - Sep 11, 2019.
As compute gets cheaper and time to market for machine learning solutions becomes more critical, we’ve explored options for speeding up model training. One of those solutions is to combine elements from Spark and scikit-learn into our own hybrid solution.
Distributed Systems, Machine Learning, Python, scikit-learn, Training
- KDnuggets™ News 19:n34, Sep 11: I wasn’t getting hired as a Data Scientist. So I sought data on who is - Sep 11, 2019.
How one person overcame rejections applying to Data Scientist positions by getting actual data on who is getting hired; Advice from Andrew Ng on building ML career and reading research papers; 10 Great Python resources for Data Scientists; Python Libraries for Interpretable ML.
Advice, Andrew Ng, Career, Data Scientist, Interpretability, Python
- The 5 Graph Algorithms That Data Scientists Should Know - Sep 10, 2019.
In this post, I am going to be talking about some of the most important graph algorithms you should know and how to implement them using Python.
Algorithms, Data Science, Data Scientist, Graph, Python
- OpenStreetMap Data to ML Training Labels for Object Detection - Sep 9, 2019.
I am really interested in creating a tight, clean pipeline for disaster relief applications, where we can use something like crowd sourced building polygons from OSM to train a supervised object detector to discover buildings in an unmapped location.
Geospatial, Machine Learning, Object Detection, Python

- 10 Great Python Resources for Aspiring Data Scientists - Sep 9, 2019.
This is a collection of 10 interesting resources in the form of articles and tutorials for the aspiring data scientist new to Python, meant to provide both insight and practical instruction when starting on your journey.
Data Science, Data Scientist, Programming, Python
- Build Your First Voice Assistant - Sep 6, 2019.
Hone your practical speech recognition application skills with this overview of building a voice assistant using Python.
Machine Learning, NLP, Python, Speech Recognition
- Learn Quantum Computing with Python and Q#, Get Programming with Python, Data Science with Python and Dask - Sep 4, 2019.
Save 40% on Get Programming with Python, Data Science with Python and Dask, and Learn Quantum Computing with Python and Q# with code nlpython40.
Dask, Manning, Programming, Python, Quantum Computing
- An Easy Introduction to Machine Learning Recommender Systems - Sep 4, 2019.
Recommender systems are an important class of machine learning algorithms that offer "relevant" suggestions to users. They are categorized as either collaborative filtering or content-based systems; check out how these approaches work, along with implementations to follow from example code.
Beginners, Machine Learning, Python, Recommendation Engine, Recommender Systems
- Python Libraries for Interpretable Machine Learning - Sep 4, 2019.
In the following post, I am going to give a brief guide to four of the most established packages for interpreting and explaining machine learning models.
Bias, Interpretability, LIME, Machine Learning, Python, SHAP
- An Overview of Topics Extraction in Python with Latent Dirichlet Allocation - Sep 4, 2019.
A recurring subject in NLP is understanding a large corpus of texts through topic extraction. Whether you analyze users’ online reviews, products’ descriptions, or text entered in search bars, understanding key topics will always come in handy.
LDA, NLP, Python, Text Analytics, Topic Modeling
- Automate your Python Scripts with Task Scheduler: Windows Task Scheduler to Scrape Alternative Data - Sep 3, 2019.
In this tutorial, you will learn how to use Windows Task Scheduler to scrape data from the Lazada (eCommerce) website and dump it into an SQLite database.
Data Science, Python, Web Scraping
- Object-oriented programming for data scientists: Build your ML estimator - Aug 30, 2019.
Implement some of the core OOP principles in a machine learning context by building your own Scikit-learn-like estimator, and making it better.
Data Scientist, Machine Learning, Programming, Python
- 4 Tips for Advanced Feature Engineering and Preprocessing - Aug 29, 2019.
Techniques for creating new features, detecting outliers, handling imbalanced data, and imputing missing values.
Data Preprocessing, Feature Engineering, Python, Tips
- Nothing but NumPy: Understanding & Creating Neural Networks with Computational Graphs from Scratch - Aug 23, 2019.
Entirely implemented with NumPy, this extensive tutorial provides a detailed review of neural networks followed by guided code for creating one from scratch with computational graphs.
Backpropagation, Neural Networks, numpy, Python
- Comparing Decision Tree Algorithms: Random Forest® vs. XGBoost - Aug 21, 2019.
Check out this tutorial walking you through a comparison of XGBoost and Random Forest. You'll learn how to create a decision tree, how to do tree bagging, and how to do tree boosting.
ActiveState, Decision Trees, Python, random forests algorithm, XGBoost
- Understanding Decision Trees for Classification in Python - Aug 21, 2019.
This tutorial covers decision trees for classification, also known as classification trees, including the anatomy of classification trees, how classification trees make predictions, using scikit-learn to make classification trees, and hyperparameter tuning.
Classification, Decision Trees, Python, scikit-learn
- Automate Stacking In Python: How to Boost Your Performance While Saving Time - Aug 21, 2019.
Utilizing stacking (stacked generalizations) is a very hot topic when it comes to pushing your machine learning algorithm to new heights. For instance, most if not all winning Kaggle submissions nowadays make use of some form of stacking or a variation of it.
Algorithms, Big Data, Data Science, Python
- KDnuggets™ News 19:n31, Aug 21: Become a Marketable Data Scientist; Data Science Command Line Basics; Chatbots with Keras - Aug 21, 2019.
This week's news: Become More Marketable as a Data Scientist; Command Line Basics Every Data Scientist Should Know; Chatbots with Keras!; Understanding Cancer using Machine Learning; Statistical Modelling vs Machine Learning; Is Kaggle Learn a "Faster Data Science Education?"; and much more!
Cancer Detection, Chatbot, Data Science, Data Scientist, Keras, NLP, Python
- An Overview of Python’s Datatable package - Aug 20, 2019.
Modern machine learning applications need to process a humongous amount of data and generate multiple features. Python’s datatable module was created to address this issue. It is a toolkit for performing big data (up to 100GB) operations on a single-node machine, at the maximum possible speed.
Big Data, Data Science, Python
- Deep Learning for NLP: Creating a Chatbot with Keras! - Aug 19, 2019.
Learn how to use Keras to build a Recurrent Neural Network and create a Chatbot! Who doesn’t like a friendly-robotic personal assistant?
Chatbot, Deep Learning, Keras, NLP, Python
- Pytorch Lightning vs PyTorch Ignite vs Fast.ai - Aug 16, 2019.
Here, I will attempt an objective comparison between all three frameworks. This comparison comes from laying out similarities and differences objectively found in tutorials and documentation of all three frameworks.
fast.ai, Neural Networks, Python, PyTorch, PyTorch Lightning
- Learn how to use PySpark in under 5 minutes (Installation + Tutorial) - Aug 13, 2019.
Apache Spark is one of the hottest and largest open source projects in data processing, a framework with rich high-level APIs for programming languages like Scala, Python, Java and R. It realizes the potential of bringing together both Big Data and machine learning.
Apache Spark, Big Data, Data Science, Python
- A 2019 Guide to Semantic Segmentation - Aug 12, 2019.
Semantic segmentation refers to the process of linking each pixel in an image to a class label. These labels could include a person, car, flower, piece of furniture, etc., just to mention a few. We’ll now look at a number of research papers covering state-of-the-art approaches to building semantic segmentation models.
Image Classification, Image Recognition, Python, Segmentation
- Keras Callbacks Explained In Three Minutes - Aug 9, 2019.
A gentle introduction to callbacks in Keras. Learn about EarlyStopping, ModelCheckpoint, and other callback functions with code examples.
Explained, Keras, Neural Networks, Python
- Introduction to Image Segmentation with K-Means clustering - Aug 9, 2019.
Image segmentation is the classification of an image into different groups. Many kinds of research have been done in the area of image segmentation using clustering. In this article, we will explore using the K-Means clustering algorithm to read an image and cluster different regions of the image.
Clustering, Computer Vision, Image Recognition, K-means, Python, Segmentation
- Exploratory Data Analysis Using Python - Aug 7, 2019.
In this tutorial, you’ll use Python and Pandas to explore a dataset and create visual distributions, identify and eliminate outliers, and uncover correlations between two datasets.
ActiveState, Data Analysis, Data Exploration, Pandas, Python
- Feature selection by random search in Python - Aug 6, 2019.
Feature selection is one of the most important tasks in machine learning. Learn how to use a simple random search in Python to get good results in less time.
Collinearity, Cross-validation, Feature Selection, Python, Random
- 25 Tricks for Pandas - Aug 6, 2019.
Check out this video (and Jupyter notebook) which outlines a number of Pandas tricks for working with and manipulating data, covering topics such as string manipulations, splitting and filtering DataFrames, combining and aggregating data, and more.
Pandas, Python, Tips
- Lagrange multipliers with visualizations and code - Aug 6, 2019.
In this story, we’re going to take an aerial tour of optimization with Lagrange multipliers. When do we need them? Whenever we have an optimization problem with constraints.
Analytics, Mathematics, Optimization, Python
- Pytorch Cheat Sheet for Beginners and Udacity Deep Learning Nanodegree - Aug 2, 2019.
This cheatsheet should be easier to digest than the official documentation and should serve as a transitional tool to help students and beginners start reading the documentation soon.
Beginners, Cheat Sheet, Deep Learning, Google Colab, Python, PyTorch, Udacity
- GPU Accelerated Data Analytics & Machine Learning - Aug 2, 2019.
The future is here! Speed up your Machine Learning workflow with the Python RAPIDS libraries.
Analytics, GPU, Machine Learning, Python
- How a simple mix of object-oriented programming can sharpen your deep learning prototype - Aug 1, 2019.
By mixing simple concepts of object-oriented programming, like functionalization and class inheritance, you can add immense value to deep learning prototyping code.
Deep Learning, Keras, Programming, Python
- KDnuggets™ News 19:n28, Jul 31: Top 13 Skills To Become a Rockstar Data Scientist; Best Podcasts on AI, Analytics, Data Science - Jul 31, 2019.
Learn the essential skills needed to become a Data Science rockstar; Understand CNNs with Python + TensorFlow + Keras tutorial; Discover the best podcasts about AI, Analytics, Data Science; and find out where you can get the best Certificates in the field.
Convolutional Neural Networks, Data Preparation, Data Science Certificate, Data Science Skills, Podcast, Python, TensorFlow
- Here’s how you can accelerate your Data Science on GPU - Jul 30, 2019.
Data Scientists need computing power. Whether you’re processing a big dataset with Pandas or running some computation on a massive matrix with NumPy, you’ll need a powerful machine to get the job done in a reasonable amount of time.
Big Data, Data Science, DBSCAN, Deep Learning, GPU, NVIDIA, Python
- Exploring Python Basics - Jul 29, 2019.
This free ebook is a great resource for data science beginners, providing a good introduction to Python, coding with Raspberry Pi, and using Python to build predictive models.
Beginners, Book, Manning, Python
- Convolutional Neural Networks: A Python Tutorial Using TensorFlow and Keras - Jul 26, 2019.
Different neural network architectures excel in different tasks. This particular article focuses on crafting convolutional neural networks in Python using TensorFlow and Keras.
Convolutional Neural Networks, Keras, Neural Networks, Python, TensorFlow
- Easy, One-Click Jupyter Notebooks - Jul 24, 2019.
All of the setup for software, networking, security, and libraries is automatically taken care of by the Saturn Cloud system. Data Scientists can then focus on the actual Data Science and not the tedious infrastructure work that surrounds it.
Big Data, Cloud, Data Science, Data Scientist, DevOps, Jupyter, Python, Saturn Cloud
- Kaggle Kernels Guide for Beginners: A Step by Step Tutorial - Jul 23, 2019.
This is an attempt to hold the hands of a complete beginner and walk them through the world of Kaggle Kernels — for them to get started.
Kaggle, Python, R
- Things I Learned From the SciPy 2019 Lightning Talks - Jul 22, 2019.
This post summarizes the interesting aspects of Day One of the SciPy 2019 lightning talks, a flash round of a dozen ~3 minute talks covering a wide variety of topics.
Presentation, Python, SciPy
- Computer Vision for Beginners: Part 1 - Jul 17, 2019.
Image processing means performing operations on images to achieve an intended manipulation. Think about what we do when we start a new data analysis. We do some data preprocessing and feature engineering. It’s the same with image processing.
Computer Vision, Deep Learning, Image Processing, Python
- Dealing with categorical features in machine learning - Jul 16, 2019.
Many machine learning algorithms require numerical input, so categorical features must be transformed into numerical features before any of these algorithms can be used.
Data Cleaning, Data Preprocessing, Feature Engineering, Machine Learning, Python
- Training a Neural Network to Write Like Lovecraft - Jul 11, 2019.
In this post, the author attempts to train a neural network to generate Lovecraft-esque prose, known to be awkward and irregular at best. Did it end in success? If not, any suggestions on how it might have? Read on to find out.
Keras, LSTM, Natural Language Generation, Neural Networks, Python, TensorFlow
- 10 Simple Hacks to Speed up Your Data Analysis in Python - Jul 11, 2019.
This article lists some curated tips for working with Python and Jupyter Notebooks, covering topics such as easily profiling data, formatting code and output, debugging, and more. Hopefully you can find something useful within.
Data Analysis, Jupyter, Pandas, Python, Tips
- How to Learn Python without First Needing to Learn Python - Jul 10, 2019.
Learn how data scientists and anyone coding with Python can set up a made-to-order runtime in minutes - not days. Read the 3-minute blog post.
ActiveState, Python
- A Gentle Guide to Starting Your NLP Project with AllenNLP - Jul 10, 2019.
For those who aren’t familiar with AllenNLP, I will give a brief overview of the library and let you know the advantages of integrating it into your project.
Allen Institute, NLP, Python, Sentiment Analysis
- Practical Speech Recognition with Python: The Basics - Jul 9, 2019.
Do you fear implementing speech recognition in your Python apps? Read this tutorial for a simple approach to getting practical with speech recognition using open source Python libraries.
Google, NLP, Python, Speech Recognition
- Annotated Heatmaps of a Correlation Matrix in 5 Simple Steps - Jul 9, 2019.
A heatmap is a graphical representation of data in which data values are represented as colors. That is, it uses color to communicate a value to the reader. This is a great tool for guiding the audience towards the areas that matter the most when you have a large volume of data.
Data Visualization, Python, Statistics
- XGBoost and Random Forest® with Bayesian Optimisation - Jul 8, 2019.
This article will explain how to use XGBoost and Random Forest with Bayesian Optimisation, and will discuss the main pros and cons of these methods.
Bayesian, Optimization, Python, random forests algorithm, XGBoost
- Classifying Heart Disease Using K-Nearest Neighbors - Jul 8, 2019.
This post is written for developers and assumes no background in statistics or mathematics. The focus is mainly on how the k-NN algorithm works and how to use it for predictive modeling problems.
Healthcare, K-nearest neighbors, Machine Learning, Medical, Python
- Building a Recommender System, Part 2 - Jul 3, 2019.
This post explores a technique for collaborative filtering which uses latent factor models, and which naturally generalizes to deep learning approaches. Our approach will be implemented using TensorFlow and Keras.
Movies, Python, Recommendation Engine, Recommender Systems