2019 May Tutorials, Overviews
- Why physical storage of your database tables might matter
- May 31, 2019.
Follow this investigation into why physical storage of your database tables might matter, from problem identification to possible issue resolutions.
- Understanding Backpropagation as Applied to LSTM
- May 30, 2019.
Backpropagation is one of those topics that seems to confuse many once you move past feed-forward neural networks and progress to convolutional and recurrent neural networks. This article gives you an overall process for understanding backpropagation by explaining its underlying principles.
- Who is your Golden Goose?: Cohort Analysis
- May 30, 2019.
Step-by-step tutorial on how to perform customer segmentation using RFM analysis and K-Means clustering in Python.
- Animations with Matplotlib
- May 30, 2019.
Animations make even more sense when depicting time series data like stock prices over the years, climate change over the past decade, or seasonalities and trends, since we can then see how a particular parameter behaves with time.
- Becoming a Level 3.0 Data Scientist
- May 29, 2019.
Want to be a Junior, Senior, or Principal Data Scientist? Find out what you need to do to navigate the Data Science Career Game.
- Choosing Between Model Candidates
- May 29, 2019.
Models are useful because they allow us to generalize from one situation to another. When we use a model, we’re working under the assumption that there is some underlying pattern we want to measure, but it has some error on top of it.
- Boost Your Image Classification Model
- May 27, 2019.
Check out this collection of tricks to improve the accuracy of your classifier.
- Careful! Looking at your model results too much can cause information leakage
- May 24, 2019.
We are all aware of the issue of overfitting, where the model you build fits the training data so closely that it does not generalise to the population the data comes from, producing very odd and potentially catastrophic results when you feed in new data.
- Analyzing Tweets with NLP in Minutes with Spark, Optimus and Twint
- May 24, 2019.
Social media has been gold for studying the way people communicate and behave. In this article I’ll show you the easiest way of analyzing tweets without the Twitter API, in a way that scales to Big Data.
- Your Guide to Natural Language Processing (NLP)
- May 23, 2019.
This extensive post covers NLP use cases, basic examples, Tokenization, Stop Words Removal, Stemming, Lemmatization, Topic Modeling, the future of NLP, and more.
- End-to-End Machine Learning: Making videos from images
- May 23, 2019.
Video is a natural way for us to understand three-dimensional and time-varying information. Read this short post on how to create videos from still images.
- When Too Likely Human Means Not Human: Detecting Automatically Generated Text
- May 23, 2019.
Passably-human automated text generation is a reality. How do we best go about detecting it? As it turns out, being too predictably human may actually be a reasonably good indicator of not being human at all.
- Extracting Knowledge from Knowledge Graphs Using Facebook’s Pytorch-BigGraph
- May 22, 2019.
We use state-of-the-art Deep Learning tools to build a model that predicts a word using the surrounding words as labels.
- Probability Mass and Density Functions
- May 21, 2019.
This content is part of a series about Chapter 3, on probability, from the Deep Learning Book by Goodfellow, I., Bengio, Y., and Courville, A. (2016). It aims to provide intuitions, drawings, and Python code on mathematical theories and is constructed as my understanding of these concepts.
- Building a Computer Vision Model: Approaches and datasets
- May 20, 2019.
How can we build a computer vision model using CNNs? What are existing datasets? And what are approaches to train the model? This article provides an answer to these essential questions when trying to understand the most important concepts of computer vision.
- 60+ useful graph visualization libraries
- May 17, 2019.
We outline 60+ graph visualization libraries that allow users to build applications to display and interact with network representations of data.
- PyCharm for Data Scientists
- May 17, 2019.
This article discusses some of PyCharm's features and compares it with Spyder, another popular IDE for Python. Read on to find the benefits and drawbacks of PyCharm, and an outline of when to prefer it to Spyder and vice versa.
- 7 Steps to Mastering SQL for Data Science — 2019 Edition
- May 17, 2019.
Follow these updated 7 steps to go from SQL data science newbie to practitioner in a hurry. We consider only the necessary concepts and skills, and provide quality resources for each.
- A complete guide to K-means clustering algorithm
- May 16, 2019.
Clustering, including K-means clustering, is an unsupervised learning technique used to group unlabeled data. We provide several examples to help further explain how it works.
- Large-Scale Evolution of Image Classifiers
- May 16, 2019.
Deep neural networks excel in many difficult tasks, given large amounts of training data and enough processing power. The neural network architecture is an important factor in achieving a highly accurate model... Techniques to automatically discover these neural network architectures are, therefore, very much desirable.
- Building Recommender systems with Azure Machine Learning service
- May 15, 2019.
Microsoft has provided a GitHub repository with Python best practice examples to facilitate the building and evaluation of recommendation systems using Azure Machine Learning services.
- Mathematical programming — Key Habit to Build Up for Advancing Data Science
- May 15, 2019.
We show how, by simulating the random throw of a dart, you can compute the value of pi approximately. This is a small step towards building the habit of mathematical programming, which should be a key skill in the repertoire of a budding data scientist.
- Customer Churn Prediction Using Machine Learning: Main Approaches and Models
- May 14, 2019.
We reach out to experts from HubSpot and ScienceSoft to discuss how SaaS companies handle the problem of customer churn prediction using Machine Learning.
- A Complete Exploratory Data Analysis and Visualization for Text Data: Combine Visualization and NLP to Generate Insights
- May 9, 2019.
Visually representing the content of a text document is one of the most important tasks in the field of text mining as a Data Scientist or NLP specialist. However, there are some gaps between visualizing unstructured (text) data and structured data.
- How to fix an Unbalanced Dataset
- May 8, 2019.
We explain several alternative ways to handle imbalanced datasets, including different resampling and ensembling methods with code examples.
- Linear Programming and Discrete Optimization with Python using PuLP
- May 8, 2019.
Knowledge of such optimization techniques is extremely useful for data scientists and machine learning (ML) practitioners as discrete and continuous optimization lie at the heart of modern ML and AI systems as well as data-driven business analytics processes.
- Best US/Canada Masters in Analytics, Business Analytics, Data Science
- May 7, 2019.
In the final part of this series, we provide an updated list of our comprehensive, unbiased survey of graduate programs in Data Science and Analytics from across the US and Canada.
- How to Automate Tasks on GitHub With Machine Learning for Fun and Profit
- May 3, 2019.
Check out this tutorial on how to build a GitHub App that predicts and applies issue labels using TensorFlow and public datasets.
- Modeling Price with Regularized Linear Model & XGBoost
- May 2, 2019.
We are going to implement regularization techniques for linear regression of house pricing data. Our goal in price modeling is to model the pattern and ignore the noise.
- How to correctly select a sample from a huge dataset in machine learning
- May 1, 2019.
We explain how choosing a small, representative dataset from a large population can improve model training reliability.
- Build Your First Chatbot Using Python & NLTK
- May 1, 2019.
Today we will learn to create a simple chat assistant or chatbot using Python’s NLTK library.