2018 Jul Tutorials, Overviews
- Google’s AutoML: Cutting Through the Hype - Jul 31, 2018.
In today’s post, I want to look specifically at Google’s AutoML, a product which has received a lot of media attention, and address "What is Google's AutoML?" and more.
- Intuitive Ensemble Learning Guide with Gradient Boosting - Jul 30, 2018.
This tutorial discusses the importance of ensemble learning, with gradient boosting as a case study.
- DevOps for Data Scientists: Taming the Unicorn - Jul 27, 2018.
How do we version control the model and add it to an app? How will people interact with our website based on the outcome? How will it scale!?
- Remote Data Science: How to Send R and Python Execution to SQL Server from Jupyter Notebooks - Jul 27, 2018.
Did you know that you can execute R and Python code remotely in SQL Server from Jupyter Notebooks or any IDE? Machine Learning Services in SQL Server eliminates the need to move data around.
- Data Retrieval with Web Scraping: A Practitioner’s Guide to NLP - Jul 26, 2018.
Proven and tested hands-on strategies to tackle NLP tasks.
- How to Build a Data Science Portfolio - Jul 25, 2018.
This post will include links to where various data science professionals (data science managers, data scientists, social media icons, or some combination thereof) and others talk about what to have in a portfolio and how to get noticed.
- Genetic Algorithm Implementation in Python - Jul 24, 2018.
This tutorial will implement the genetic algorithm optimization technique in Python based on a simple example in which we are trying to maximize the output of an equation.
- Cookiecutter Data Science: How to Organize Your Data Science Project - Jul 24, 2018.
A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.
- Comparison of Top 6 Python NLP Libraries - Jul 23, 2018.
Today, we want to outline and compare the most popular and helpful natural language processing libraries, based on our experience.
- Receiver Operating Characteristic Curves Demystified (in Python) - Jul 20, 2018.
In this blog, I will reveal, step by step, how to plot an ROC curve using Python. After that, I will explain the characteristics of a basic ROC curve.
- The ultimate list of Web Scraping tools and software - Jul 19, 2018.
Here's your guide to picking the right web scraping tool for your specific data needs.
- Best (and Free!!) Resources to Understand Nuts and Bolts of Deep Learning - Jul 19, 2018.
This blog, however, is not aimed at the absolute beginner. Once you have a bit of intuition about how Deep Learning algorithms work, you might want to understand how things work under the hood.
- Explaining the 68-95-99.7 rule for a Normal Distribution - Jul 19, 2018.
This post explains how those numbers were derived in the hope that they can be more interpretable for your future endeavors.
- Efficient Graph-based Word Sense Induction - Jul 18, 2018.
This paper describes a set of algorithms for Natural Language Processing (NLP) that match or exceed the state of the art on several evaluation tasks, while also being much more computationally efficient.
- 5 Quick and Easy Data Visualizations in Python with Code - Jul 18, 2018.
This post provides an overview of a small number of widely used data visualizations, and includes code in the form of functions to implement each in Python using Matplotlib.
- Clustering Using K-means Algorithm - Jul 18, 2018.
This article explains the K-means algorithm in an easy way. I’d like to start with an example to understand the objective of this powerful technique in machine learning before getting into the algorithm, which is quite simple.
- Basic Image Processing in Python, Part 2 - Jul 17, 2018.
We explain how to easily access and manipulate the internal components of digital images using Python and give examples from satellite image processing.
- BigQuery vs Redshift: Pricing Strategy - Jul 17, 2018.
In this blog post, we’re going to break down BigQuery vs Redshift pricing structures and see how they work in detail.
- fast.ai Deep Learning Part 2 Complete Course Notes - Jul 17, 2018.
This post is a collection of fantastic notes on the fast.ai deep learning part 2 MOOC freely available online, as written and shared by a student. These notes are a valuable learning resource either as a supplement to the courseware or on their own.
- Beginners Ask “How Many Hidden Layers/Neurons to Use in Artificial Neural Networks?” - Jul 16, 2018.
By the end of this article, you should at least have an idea of how these questions are answered and be able to test yourself on simple examples.
- Text Mining on the Command Line - Jul 13, 2018.
In this tutorial, I use raw bash commands and regex to process a raw, messy JSON file and a raw HTML page. The tutorial helps us understand the text processing mechanism under the hood.
- Dimensionality Reduction: Does PCA really improve classification outcome? - Jul 13, 2018.
In this post, I am going to verify this statement using Principal Component Analysis (PCA) to try to improve the classification performance of a neural network over a dataset.
- Basic Image Data Analysis Using Numpy and OpenCV – Part 1 - Jul 10, 2018.
Accessing the internal components of digital images with Python packages makes it more convenient to understand their properties and nature.
- Analyze a Soccer (Football) Game Using Tensorflow Object Detection and OpenCV - Jul 10, 2018.
For the data scientist within you, let's use this opportunity to do some analysis on soccer clips. With the use of deep learning and OpenCV, we can extract interesting insights from video clips.
- fast.ai Deep Learning Part 1 Complete Course Notes - Jul 10, 2018.
This post is a collection of fantastic notes on the fast.ai deep learning part 1 MOOC freely available online, as written and shared by a student. These notes are a valuable learning resource either as a supplement to the courseware or on their own.
- The 4 Levels of Data Usage in Data Science - Jul 9, 2018.
This is an overview of the 4 levels, or "buckets," of data usage in business, starting at monitoring and progressing to automation.
- Introduction to Apache Spark - Jul 6, 2018.
This is the first blog in this series to analyze Big Data using Spark. It provides an introduction to Spark and its ecosystem.
- fast.ai Machine Learning Course Notes - Jul 6, 2018.
This post is a collection of fantastic notes on the fast.ai machine learning MOOC freely available online, as written and shared by a student. These notes are a valuable learning resource either as a supplement to the courseware or on their own.
- 5 of Our Favorite Free Visualization Tools - Jul 5, 2018.
5 key free data visualization tools that can provide flexible and effective data presentation.
- Manage your Machine Learning Lifecycle with MLflow – Part 1 - Jul 5, 2018.
Reproducibility, good management, and experiment tracking are necessary to make it easy to test others' work and analysis. In this first part, we will learn through simple examples how to record and query experiments and how to package Machine Learning models so they are reproducible and can be run on any platform using MLflow.
- Text Classification & Embeddings Visualization Using LSTMs, CNNs, and Pre-trained Word Vectors - Jul 5, 2018.
In this tutorial, I classify Yelp round-10 review datasets. After processing the review comments, I trained three models in three different ways and obtained three word embeddings.
- Deep Learning Tips and Tricks - Jul 4, 2018.
This post is a distilled collection of conversations, messages, and debates on how to optimize deep models. If you have tricks you’ve found impactful, please share them in the comments below!
- Overview and benchmark of traditional and deep learning models in text classification - Jul 3, 2018.
In this post, traditional and deep learning models in text classification will be thoroughly investigated, including a discussion of both recurrent and convolutional neural networks.
- Deep Quantile Regression - Jul 3, 2018.
Most Deep Learning frameworks currently focus on giving a best estimate as defined by a loss function. Occasionally something beyond a point estimate is required to make a decision. This is where a distribution would be useful. This article will purely focus on inferring quantiles.
- Data Retrieval and Cleaning: Tracking Migratory Patterns - Jul 3, 2018.
In this post, we walk through investigating, retrieving, and cleaning a real world data set. We will also describe the cost benefits and necessary tools involved in building your own data sets.
- Automated Machine Learning vs Automated Data Science - Jul 2, 2018.
Adding the term "automated" in front of these two separate, distinct concepts does not somehow make them equivalent. Machine learning and data science are not the same thing.