2019 Sep Tutorials, Overviews
- Know Your Data: Part 1 - Sep 30, 2019.
This article will introduce the different types of data sets, data objects and attributes.
- DeepMind Has Quietly Open Sourced Three New Impressive Reinforcement Learning Frameworks - Sep 30, 2019.
Three new releases that will help researchers streamline the implementation of reinforcement learning programs.
- Using Time Series Encodings to Discover Baseball History’s Most Interesting Seasons - Sep 27, 2019.
Take me out to the ballgame! Take me out to the crowd! For the 2,829 seasons that have been played for 101 baseball teams since 1880, which seasons were unlike any others? Using SAX Encoding to recognize patterns in time series data, the most special years in baseball can be found.
- What is Hierarchical Clustering? - Sep 27, 2019.
The article contains a brief introduction to various concepts related to the hierarchical clustering algorithm.
- The Future of Analytics and Data Science - Sep 26, 2019.
Learn about the current and future issues of data science and possible solutions from this interview with IADSS Co-founder, Dr. Usama Fayyad following his keynote speech at ODSC Boston 2019.
- Natural Language in Python using spaCy: An Introduction - Sep 26, 2019.
This article provides a brief introduction to working with natural language (sometimes called “text analytics”) in Python using spaCy and related libraries.
- Customer Segmentation for R Users - Sep 26, 2019.
This article shows you how to separate your customers into distinct groups based on their purchase behavior. For the R enthusiasts out there, I demonstrate what you can do with r/stats, ggradar, ggplot2, animation, and factoextra.
- Beta Distribution: What, When & How - Sep 25, 2019.
This article covers the beta distribution, and explains it using baseball batting averages.
- Automatic Version Control for Data Scientists - Sep 24, 2019.
How can you keep your machine learning models and data organized so you can collaborate effectively? Discover this new tool set available for better version control designed for the data scientist workflow.
- A 2019 Guide for Automatic Speech Recognition - Sep 24, 2019.
In this article, we’ll look at a couple of papers aimed at solving the problem of automated speech recognition with machine and deep learning.
- A Single Function to Streamline Image Classification with Keras - Sep 23, 2019.
We show, step-by-step, how to construct a single, generalized, utility function to pull images automatically from a directory and train a convolutional neural net model.
- Introducing IceCAPS: Microsoft’s Framework for Advanced Conversation Modeling - Sep 23, 2019.
The new open source framework that brings multi-task learning to conversational agents.
- A Gentle Introduction to PyTorch 1.2 - Sep 20, 2019.
This comprehensive tutorial aims to introduce the fundamentals of PyTorch building blocks for training neural networks.
- Automate Hyperparameter Tuning for Your Models - Sep 20, 2019.
When we create our machine learning models, a common task that falls on us is how to tune them. So that brings us to the quintessential question: Can we automate this process?
- Scikit-Learn & More for Synthetic Dataset Generation for Machine Learning - Sep 19, 2019.
While mature algorithms and extensive open-source libraries are widely available for machine learning practitioners, sufficient data to apply these techniques remains a core challenge. Discover how to leverage scikit-learn and other tools to generate synthetic data appropriate for optimizing and fine-tuning your models.
- Applying Data Science to Cybersecurity Network Attacks & Events - Sep 19, 2019.
Check out this detailed tutorial on applying data science to the cybersecurity domain, written by an individual with backgrounds in both fields.
- The 5 Sampling Algorithms every Data Scientist need to know - Sep 18, 2019.
Algorithms are at the core of data science and sampling is a critical technique that can make or break a project. Learn more about the most common sampling techniques used, so you can select the best approach while working with your data.
- Reddit Post Classification - Sep 18, 2019.
This article covers the implementation of a data scraping and natural language processing project which had two parts: scrape as many posts from Reddit’s API as allowed, and then use classification models to predict the origin of the posts.
- Explore the world of Bioinformatics with Machine Learning - Sep 17, 2019.
The article contains a brief introduction of Bioinformatics and how a machine learning classification algorithm can be used to classify the type of cancer in each patient by their gene expressions.
- BERT, RoBERTa, DistilBERT, XLNet: Which one to use? - Sep 17, 2019.
Lately, varying improvements over BERT have been shown — and here I will contrast the main similarities and differences so you can choose which one to use in your research or application.
- How Bad is Multicollinearity? - Sep 17, 2019.
For some people, any correlation below 60% is acceptable, while for others even a correlation of 30% to 40% is considered too high, because one variable may end up exaggerating the performance of the model or completely messing up the parameter estimates.
- 5 Step Guide to Scalable Deep Learning Pipelines with d6tflow - Sep 16, 2019.
How to turn a typical PyTorch script into a scalable d6tflow DAG for faster research & development.
- What is Machine Behavior? - Sep 16, 2019.
The new emerging field that wants to study AI agents the way social scientists study humans.
- Version Control for Data Science: Tracking Machine Learning Models and Datasets - Sep 13, 2019.
I am a Git god, why do I need another version control system for Machine Learning Projects?
- The State of Transfer Learning in NLP - Sep 13, 2019.
This post expands on the NAACL 2019 tutorial on Transfer Learning in NLP organized by Matthew Peters, Swabha Swayamdipta, Thomas Wolf, and Sebastian Ruder. This post highlights key insights and takeaways and provides updates based on recent work.
- Ensemble Methods for Machine Learning: AdaBoost - Sep 12, 2019.
It turns out that if we ask the weak algorithm to create a whole bunch of classifiers (all weak by definition) and then combine them all, what may emerge is a stronger classifier.
- A Friendly Introduction to Support Vector Machines - Sep 12, 2019.
This article explains the Support Vector Machines (SVM) algorithm in an easy way.
- Can graph machine learning identify hate speech in online social networks? - Sep 11, 2019.
Online hate speech is a complex subject. Follow this demonstration using state-of-the-art graph neural network models to detect hateful users based on their activities on the Twitter social network.
- Train sklearn 100x Faster - Sep 11, 2019.
As compute gets cheaper and time to market for machine learning solutions becomes more critical, we’ve explored options for speeding up model training. One of those solutions is to combine elements from Spark and scikit-learn into our own hybrid solution.
- Scikit-Learn vs mlr for Machine Learning - Sep 10, 2019.
How does the scikit-learn machine learning library for Python compare to the mlr package for R? Follow along with a machine learning workflow through each approach, and see if you can gain a competitive advantage by knowing both frameworks.
- The 5 Graph Algorithms That Data Scientists Should Know - Sep 10, 2019.
In this post, I am going to be talking about some of the most important graph algorithms you should know and how to implement them using Python.
- A 2019 Guide to Speech Synthesis with Deep Learning - Sep 9, 2019.
In this article, we’ll look at research and model architectures that have been written and developed to do just that using deep learning.
- OpenStreetMap Data to ML Training Labels for Object Detection - Sep 9, 2019.
I am really interested in creating a tight, clean pipeline for disaster relief applications, where we can use something like crowd sourced building polygons from OSM to train a supervised object detector to discover buildings in an unmapped location.
- How DeepMind and Waymo are Using Evolutionary Competition to Train Self-Driving Vehicles - Sep 9, 2019.
Recently, Alphabet’s subsidiaries Waymo and DeepMind partnered to find a more efficient process to train self-driving vehicle algorithms, and their work took them back to one of the cornerstones of our history as a species: evolution.
- 10 Great Python Resources for Aspiring Data Scientists - Sep 9, 2019.
This is a collection of 10 interesting resources in the form of articles and tutorials for the aspiring data scientist new to Python, meant to provide both insight and practical instruction when starting on your journey.
- Build Your First Voice Assistant - Sep 6, 2019.
Hone your practical speech recognition application skills with this overview of building a voice assistant using Python.
- An Easy Introduction to Machine Learning Recommender Systems - Sep 4, 2019.
Recommender systems are an important class of machine learning algorithms that offer "relevant" suggestions to users. They are categorized as either collaborative filtering or content-based systems; check out how these approaches work, along with implementations to follow from example code.
- Python Libraries for Interpretable Machine Learning - Sep 4, 2019.
In the following post, I am going to give a brief guide to four of the most established packages for interpreting and explaining machine learning models.
- An Overview of Topics Extraction in Python with Latent Dirichlet Allocation - Sep 4, 2019.
A recurring subject in NLP is understanding a large corpus of text through topic extraction. Whether you analyze users’ online reviews, products’ descriptions, or text entered in search bars, understanding key topics will always come in handy.
- Beyond Neurons: Five Cognitive Functions of the Human Brain that we are Trying to Recreate with Artificial Intelligence - Sep 3, 2019.
The quest for recreating cognitive capabilities of the brain in deep neural networks remains one of the elusive goals of AI. Let’s explore some human cognitive skills that are serving as inspiration to a new generation of AI techniques.
- Automate your Python Scripts with Task Scheduler: Windows Task Scheduler to Scrape Alternative Data - Sep 3, 2019.
In this tutorial, you will learn how to use Task Scheduler to scrape data from the Lazada (eCommerce) website and dump it into an SQLite database.
- Top 10 Data Science Use Cases in Energy and Utilities - Sep 2, 2019.
In this article, we will consider the most vivid data science use cases in the industry of energy and utilities.