2018 Jul Tutorials, Overviews

Intuitive Ensemble Learning Guide with Gradient Boosting

This tutorial discusses the importance of ensemble learning, with gradient boosting as a case study.

on Jul 30, 2018 in Ensemble Methods, Gradient Boosting, Python
DevOps for Data Scientists: Taming the Unicorn

How do we version control the model and add it to an app? How will people interact with our website based on the outcome? How will it scale!?

on Jul 27, 2018 in Data Science, Data Scientist, DevOps, Unicorn, Version Control
Remote Data Science: How to Send R and Python Execution to SQL Server from Jupyter Notebooks

Did you know that you can execute R and Python code remotely in SQL Server from Jupyter Notebooks or any IDE? Machine Learning Services in SQL Server eliminates the need to move data around.

on Jul 27, 2018 in Jupyter, Machine Learning, Microsoft, Python, R, SQL, SQL Server
Data Retrieval with Web Scraping: A Practitioner’s Guide to NLP

Proven and tested hands-on strategies to tackle NLP tasks.

on Jul 26, 2018 in Data Preprocessing, NLP, Text Analytics, Workflow
How to Build a Data Science Portfolio

This post links to where various data science professionals (data science managers, data scientists, social media icons, or some combination thereof) and others discuss what to have in a portfolio and how to get noticed.

on Jul 25, 2018 in Advice, Career, Data Science, Portfolio, Resume, Social Media
Genetic Algorithm Implementation in Python

This tutorial will implement the genetic algorithm optimization technique in Python based on a simple example in which we are trying to maximize the output of an equation.

on Jul 24, 2018 in Algorithms, Genetic Algorithm, Python
Cookiecutter Data Science: How to Organize Your Data Science Project

A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.

on Jul 24, 2018 in Data Science, Programming, Project, Python
Comparison of Top 6 Python NLP Libraries

Today, we want to outline and compare the most popular and helpful natural language processing libraries, based on our experience.

on Jul 23, 2018 in NLP, Python
Receiver Operating Characteristic Curves Demystified (in Python)

In this blog, I will reveal, step by step, how to plot an ROC curve using Python. After that, I will explain the characteristics of a basic ROC curve.

on Jul 20, 2018 in Machine Learning, Metrics, Python, ROC-AUC
The ultimate list of Web Scraping tools and software

Here's your guide to picking the right web scraping tool for your specific data needs.

on Jul 19, 2018 in Data, import.io, Mozenda, Octoparse, ParseHub, Web Mining, Web Scraping
Explaining the 68-95-99.7 rule for a Normal Distribution

This post explains how those numbers were derived in the hope that they can be more interpretable for your future endeavors.

on Jul 19, 2018 in Data Analysis, Data Science, Normal Distribution, Python, Statistics
5 Quick and Easy Data Visualizations in Python with Code

This post provides an overview of a small number of widely used data visualizations, and includes code in the form of functions to implement each in Python using Matplotlib.

on Jul 18, 2018 in Data Visualization, Matplotlib, Python
Clustering Using K-means Algorithm

This article explains the K-means algorithm in an easy way. I’d like to start with an example to understand the objective of this powerful technique in machine learning before getting into the algorithm, which is quite simple.

on Jul 18, 2018 in Algorithms, Clustering, K-means
Basic Image Processing in Python, Part 2

We explain how to easily access and manipulate the internal components of digital images using Python and give examples from satellite image processing.

on Jul 17, 2018 in Computer Vision, Image Processing, numpy, Python
fast.ai Deep Learning Part 2 Complete Course Notes

This post is a collection of fantastic notes on the fast.ai deep learning part 2 MOOC freely available online, as written and shared by a student. These notes are a valuable learning resource either as a supplement to the courseware or on their own.

on Jul 17, 2018 in Deep Learning, fast.ai, Jeremy Howard, MOOC
Beginners Ask “How Many Hidden Layers/Neurons to Use in Artificial Neural Networks?”

By the end of this article, you should at least have an idea of how these questions are answered and be able to test yourself on simple examples.

on Jul 16, 2018 in Architecture, Deep Learning, Hyperparameter, Neural Networks
Text Mining on the Command Line

In this tutorial, I use raw bash commands and regex to process a raw, messy JSON file and a raw HTML page. The tutorial helps us understand the text processing mechanism under the hood.

on Jul 13, 2018 in Data Preparation, Data Preprocessing, NLP, Text Mining
Basic Image Data Analysis Using Numpy and OpenCV – Part 1

Accessing the internal components of digital images with Python packages makes it easier to understand their properties and nature.

on Jul 10, 2018 in Computer Vision, Image Processing, numpy, OpenCV, Python
Analyze a Soccer (Football) Game Using Tensorflow Object Detection and OpenCV

For the data scientist within you, let's use this opportunity to do some analysis on soccer clips. With deep learning and OpenCV, we can extract interesting insights from video clips.

on Jul 10, 2018 in Football, Image Recognition, Object Detection, OpenCV, Python, Soccer, TensorFlow, Video recognition, World Cup
fast.ai Deep Learning Part 1 Complete Course Notes

This post is a collection of fantastic notes on the fast.ai deep learning part 1 MOOC freely available online, as written and shared by a student. These notes are a valuable learning resource either as a supplement to the courseware or on their own.

on Jul 10, 2018 in Deep Learning, fast.ai, Jeremy Howard, MOOC
The 4 Levels of Data Usage in Data Science

This is an overview of the 4 levels, or "buckets," of data usage in business, starting at monitoring and progressing to automation.

on Jul 9, 2018 in Automation, Ben Lorica, Business, Data Science, O'Reilly
Introduction to Apache Spark

This is the first blog in this series to analyze Big Data using Spark. It provides an introduction to Spark and its ecosystem.

on Jul 6, 2018 in Apache Spark, Data Processing, Distributed Systems
fast.ai Machine Learning Course Notes

This post is a collection of fantastic notes on the fast.ai machine learning MOOC freely available online, as written and shared by a student. These notes are a valuable learning resource either as a supplement to the courseware or on their own.

on Jul 6, 2018 in fast.ai, Jeremy Howard, Machine Learning, MOOC
5 of Our Favorite Free Visualization Tools

5 key free data visualization tools that can provide flexible and effective data presentation.

on Jul 5, 2018 in Analytics, D3.js, Data Science, Data Visualization, Free Software, R, Tableau
Text Classification & Embeddings Visualization Using LSTMs, CNNs, and Pre-trained Word Vectors

In this tutorial, I classify Yelp round-10 review datasets. After processing the review comments, I train three models in three different ways and obtain three word embeddings.

on Jul 5, 2018 in Convolutional Neural Networks, Keras, LSTM, NLP, Python, Text Classification, Word Embeddings
Overview and benchmark of traditional and deep learning models in text classification

In this post, traditional and deep learning models in text classification will be thoroughly investigated, including a discussion of both recurrent and convolutional neural networks.

on Jul 3, 2018 in Deep Learning, NLP, Text Classification
Deep Quantile Regression

Most Deep Learning frameworks currently focus on giving a best estimate as defined by a loss function. Occasionally something beyond a point estimate is required to make a decision. This is where a distribution would be useful. This article will purely focus on inferring quantiles.

on Jul 3, 2018 in Deep Learning, Hyperparameter, Keras, Neural Networks, Python, Regression
Automated Machine Learning vs Automated Data Science

Adding the term "automated" in front of these 2 separate, distinct concepts does not somehow make them equivalent. Machine learning and data science are not the same thing.

on Jul 2, 2018 in Automated Data Science, Automated Machine Learning, Data Science, Machine Learning
