2018 Aug Tutorials, Overviews
- AI Knowledge Map: How To Classify AI Technologies - Aug 31, 2018.
What follows is then an effort to draw an architecture to access knowledge on AI and follow emergent dynamics, a gateway of pre-existing knowledge on the topic that will allow you to scout around for additional information and eventually create new knowledge on AI.
- Optimus v2: Agile Data Science Workflows Made Easy - Aug 30, 2018.
Looking for a library to skyrocket your productivity as a Data Scientist? Check this out!
- Topic Modeling with LSA, PLSA, LDA & lda2Vec - Aug 30, 2018.
This article is a comprehensive overview of Topic Modeling and its associated techniques.
- Word Vectors in Natural Language Processing: Global Vectors (GloVe) - Aug 29, 2018.
A well-known model that learns vectors for words from their co-occurrence information is Global Vectors (GloVe). While word2vec is a predictive model (a feed-forward neural network that learns vectors to improve its predictive ability), GloVe is a count-based model.
- Deploying scikit-learn Models at Scale - Aug 29, 2018.
Find out how to serve your scikit-learn model in an auto-scaling, serverless environment! Today, we’ll take a trained scikit-learn model and deploy it on Cloud ML Engine.
- Linear Regression In Real Life - Aug 28, 2018.
A helpful guide to Linear Regression, using the example of a friends' road trip to Las Vegas to highlight how it can be used in a real-life situation.
- How to Make Your Machine Learning Models Robust to Outliers - Aug 28, 2018.
In this blog, we’ll try to understand the different interpretations of this “distant” notion. We will also look into the outlier detection and treatment techniques while seeing their impact on different types of machine learning models.
- Are Vectorized Random Number Generators Actually Useful? - Aug 28, 2018.
I reported that you can multiply the speed of common (fast) random number generators such as PCG and xorshift128+ by a factor of three or four by vectorizing them using SIMD instructions. Is this actually useful in practice?
- Multi-Class Text Classification with Scikit-Learn - Aug 27, 2018.
The vast majority of text classification articles and tutorials on the internet cover binary text classification, such as email spam filtering and sentiment analysis. Real-world problems are much more complicated than that.
- Data Visualization Cheat Sheet - Aug 24, 2018.
Core principles for successful data visualization, including tips on how to reduce clutter, preattentive processing and how to integrate text within the graph.
- Emotion and Sentiment Analysis: A Practitioner’s Guide to NLP, by Dipanjan Sarkar - Aug 24, 2018.
Sentiment analysis is widely used, especially as a part of social media analysis for any domain, be it a business, a recent movie, or a product launch, to understand its reception by the people and what they think of it based on their opinions or, you guessed it, sentiment!
- Comparison of the Most Useful Text Processing APIs - Aug 23, 2018.
There is a need to compare different APIs to understand key pros and cons they have and when it is better to use one API instead of the other. Let us proceed with the comparison.
- 9 Things You Should Know About TensorFlow - Aug 22, 2018.
A summary of the key points from the Google Cloud Next session in San Francisco, "What’s New with TensorFlow?", including neural networks, TensorFlow Lite, data pipelines and more.
- Docker Cheat Sheet - Aug 21, 2018.
This comprehensive cheat sheet will assist Docker users, experienced and new, in getting containers up-and-running quickly. We list commands that will allow users to install, build, ship and run Docker containers.
- UX Design Guide for Data Scientists and AI Products - Aug 21, 2018.
Realizing that there is a legitimate knowledge gap between UX Designers and Data Scientists, I have decided to attempt addressing the needs from the Data Scientist’s perspective.
- Basic Statistics in Python: Probability - Aug 21, 2018.
At the most basic level, probability seeks to answer the question, "What is the chance of an event happening?" To calculate the chance of an event happening, we also need to consider all the other events that can occur.
- Interpreting a data set, beginning to end - Aug 20, 2018.
Detailed knowledge of your data is key to understanding it! We review several important methods for understanding the data, including summary statistics with visualization, embedding methods like PCA and t-SNE, and Topological Data Analysis.
- Why Automated Feature Engineering Will Change the Way You Do Machine Learning - Aug 20, 2018.
Automated feature engineering will save you time, build better predictive models, create meaningful features, and prevent data leakage.
- Introduction to Fraud Detection Systems - Aug 17, 2018.
Using the Python gradient boosting library LightGBM, this article introduces fraud detection systems, with code samples included to help you get started.
- Auto-Keras, or How You can Create a Deep Learning Model in 4 Lines of Code - Aug 17, 2018.
Auto-Keras is an open source software library for automated machine learning. Auto-Keras provides functions to automatically search for architecture and hyperparameters of deep learning models.
- Named Entity Recognition: A Practitioner’s Guide to NLP - Aug 17, 2018.
Named entity recognition (NER), also known as entity chunking/extraction, is a popular technique used in information extraction to identify and segment the named entities and classify or categorize them under various predefined classes.
- Reinforcement Learning: The Business Use Case, Part 2 - Aug 16, 2018.
In this post, I will explore the implementation of reinforcement learning in trading. The financial industry has been exploring the applications of Artificial Intelligence and Machine Learning for its use cases, but the monetary risk has prompted reluctance.
- A Crash Course in MXNet Tensor Basics & Simple Automatic Differentiation - Aug 16, 2018.
This is an overview of some basic functionality of the MXNet ndarray package for creating tensor-like objects, and using the autograd package for performing automatic differentiation.
- An Introduction to t-SNE with Python Example - Aug 15, 2018.
In this post we’ll give an introduction to the exploratory and visualization t-SNE algorithm. t-SNE is a powerful dimension reduction and visualization technique used on high dimensional data.
- AutoKeras: The Killer of Google’s AutoML - Aug 15, 2018.
Auto-Keras is an open source "competitor" to Google’s AutoML, a new cloud software suite of Machine Learning tools. It’s based on Google’s state-of-the-art research in Neural Architecture Search (NAS).
- How to Set Up a Free Data Science Environment on Google Cloud - Aug 15, 2018.
In this post, we'll walk through how to set up a data science environment on Google Cloud Platform (GCP). Because of the economy of scale that cloud hosting companies provide, individuals or teams can affordably access powerful computers.
- Unveiling Mathematics Behind XGBoost - Aug 14, 2018.
Follow me till the end, and I assure you that you will at least get a sense of what is happening underneath the revolutionary machine learning model.
- Setting up your AI Dev Environment in 5 Minutes - Aug 13, 2018.
Whether you're a novice data science enthusiast setting up TensorFlow for the first time, or a seasoned AI engineer working with terabytes of data, getting your libraries, packages, and frameworks installed is always a struggle. Learn how datmo, an open source Python package, helps you get started in minutes.
- Unsupervised Learning Demystified - Aug 13, 2018.
Unsupervised learning is a pattern-finding technique for mining inspiration from your data. Let's demystify!
- Understanding Language Syntax and Structure: A Practitioner’s Guide to NLP - Aug 10, 2018.
Knowledge about the structure and syntax of language is helpful in many areas like text processing, annotation, and parsing for further operations such as text classification or summarization.
- Building Reliable Machine Learning Models with Cross-validation - Aug 9, 2018.
Cross-validation is frequently used to train, measure and finally select a machine learning model for a given dataset because it helps assess how the results of a model will generalize to an independent data set in practice.
- Reinforcement Learning: The Business Use Case, Part 1 - Aug 9, 2018.
At base, RL is a complex algorithm for mapping observed entities and measures into some set of actions, while optimizing for a long-term or short-term reward.
- Optimization 101 for Data Scientists - Aug 8, 2018.
We show how to use optimization strategies to make the best possible decision.
- GitHub Python Data Science Spotlight: AutoML, NLP, Visualization, ML Workflows - Aug 8, 2018.
This post includes a wide spectrum of data science projects, all of which are open source and are present on GitHub repositories.
- Programming Best Practices For Data Science - Aug 7, 2018.
In this post, I'll go over the two mindsets most people switch between when doing programming work specifically for data science: the prototype mindset and the production mindset.
- Autoregressive Models in TensorFlow - Aug 6, 2018.
This article investigates autoregressive models in TensorFlow, including autoregressive time series and predictions with the actual observations.
- Only Numpy: Implementing GANs and Adam Optimizer using Numpy - Aug 6, 2018.
This post is an implementation of GANs and the Adam optimizer using only Python and Numpy, with minimal focus on the underlying maths involved.
- K-Means in Real Life: Clustering Workout Sessions - Aug 3, 2018.
By using the within-cluster sum of squares as the cost function, data points in the same cluster will be similar to each other, whereas data points in different clusters will have a lower level of similarity.
- Text Wrangling & Pre-processing: A Practitioner’s Guide to NLP - Aug 3, 2018.
I will highlight some of the most important steps that are used heavily in Natural Language Processing (NLP) pipelines and that I frequently use in my own NLP projects.
- WTF is TF-IDF? - Aug 2, 2018.
Relevant words are not necessarily the most frequent words since stopwords like “the”, “of” or “a” tend to occur very often in many documents.
- From Data to Viz: how to select the right chart for your data - Aug 1, 2018.
We offer an interactive, decision tree-style tool, which examines the data you have and proposes a set of potentially appropriate visualizations to represent your dataset.
- Basic Statistics in Python: Descriptive Statistics - Aug 1, 2018.
This article covers defining statistics, descriptive statistics, measures of central tendency, and measures of spread. This article assumes no prior knowledge of statistics, but does require at least a general knowledge of Python.
- Selecting the Best Machine Learning Algorithm for Your Regression Problem - Aug 1, 2018.
This post should then serve as a great aid in selecting the best ML algorithm for your regression problem!