2019 Oct


How to Make an Agile Team Work for Big Data Analytics

Learn how to approach the challenges when merging an agile methodology into a data science team to bring out the best value for your Big Data products.

on Oct 31, 2019 in Agile, Big Data, Big Data Analytics, Data Science Team
How to Build Your Own Logistic Regression Model in Python

A hands-on guide to logistic regression for aspiring data scientists and machine learning engineers.

on Oct 31, 2019 in Logistic Regression, Machine Learning, Python
AutoML for Temporal Relational Data: A New Frontier

While AutoML started out as an automation approach to develop optimal machine learning pipelines, extensions of AutoML to Data Science embedded products can now enable the processing of much more, including temporal relational data.

on Oct 30, 2019 in AutoML, KDD, Temporal Data, Time Series
Research Guide for Transformers

The problem with RNNs and CNNs is that they aren’t able to keep up with context and content when sentences are too long. This limitation has been solved by paying attention to the word that is currently being operated on. This guide will focus on how this problem can be addressed by Transformers with the help of deep learning.

on Oct 30, 2019 in BERT, NLP, Research, Transformer, ULMFiT
5 Statistical Traps Data Scientists Should Avoid

Here are five statistical fallacies — data traps — which data scientists should be aware of and definitely avoid.

on Oct 30, 2019 in Bias, Fallacies, Simpson's Paradox, Statistics
Why is Machine Learning Deployment Hard?

Developing an excellent machine learning model is one thing. Deploying it to production is another. Consider these lessons learned and recommendations for approaching this important challenge to help ensure value from your AI work.

on Oct 29, 2019 in Deployment, Machine Learning
How to Extend Scikit-learn and Bring Sanity to Your Machine Learning Workflow

In this post, learn how to extend Scikit-learn code to make your experiments easier to maintain and reproduce.

on Oct 29, 2019 in Machine Learning, Python, scikit-learn, Software Engineering, Workflow
Data Sources 101

Data collection is one of the first steps of the data lifecycle — you need to get all the data you require in the first place. To collect the right data, you need to know where to find it and determine the effort involved in collecting it. This article answers the most basic question: where does all the data you need (or might need) come from?

on Oct 28, 2019 in Big Data, Data Science, Datasets, Unstructured data
How Bayes’ Theorem is Applied in Machine Learning

Learn how Bayes' Theorem is applied in machine learning for classification and regression!

on Oct 28, 2019 in Bayes Theorem, Machine Learning, Naive Bayes, Probability
DeepMind is Using This Old Technique to Evaluate Fairness in Machine Learning Models

Visualizing the datasets is an essential component to identify potential sources of bias and unfairness. DeepMind relied on a method called Causal Bayesian networks (CBNs) to represent and estimate unfairness in a dataset.

on Oct 28, 2019 in Bayesian Networks, DeepMind, Machine Learning
5 Advanced Features of Pandas and How to Use Them

The pandas library offers core functionality for preparing your data with Python. But many users don't go beyond the basics, so learn about these lesser-known advanced methods that will make handling your data easier and cleaner.

on Oct 25, 2019 in Data Preparation, Pandas, Python
Introduction to Natural Language Processing (NLP)

Have you ever wondered how your personal assistant (e.g., Siri) is built? Do you want to build your own? Perfect! Let’s talk about Natural Language Processing.

on Oct 25, 2019 in Beginners, NLP
Feature Selection: Beyond feature importance?

In this post, you will see three different techniques for performing feature selection on your datasets and learn how to build an effective predictive model.

on Oct 24, 2019 in Feature Selection, Machine Learning
Convolutional Neural Network for Breast Cancer Classification

See how deep learning can help in detecting one of the most commonly diagnosed cancers in women.

on Oct 24, 2019 in Cancer Detection, Deep Learning, Healthcare, Python
Intro to Adversarial Machine Learning and Generative Adversarial Networks

In this crash course on GANs, we explore where they fit into the pantheon of generative models, how they've changed over time, and what the future has in store for this area of machine learning.

on Oct 23, 2019 in Adversarial, AI, GANs, Generative Adversarial Network, Machine Learning
How to Measure Foot Traffic Using Data Analytics

You need to know how many people visit your store now and what sort of audience you're acquiring. Foot traffic data is going to be invaluable to the success of your business.

on Oct 23, 2019 in Business, Data Analytics, Traffic
Time Series Analysis: A Simple Example with KNIME and Spark

The task: train and evaluate a simple time series model using a random forest of regression trees and the NYC Yellow taxi dataset.

on Oct 23, 2019 in Apache Spark, Knime, Rosaria Silipo, Seasonality, Time Series
Everything a Data Scientist Should Know About Data Management

For full-stack data science mastery, you must understand data management along with all the bells and whistles of machine learning. This high-level overview is a road map for the history and current state of the expansive options for data storage and infrastructure solutions.

on Oct 22, 2019 in Data Management, Data Scientist, Hadoop
How to Write Web Apps Using Simple Python for Data Scientists

Convert your Data Science Projects into cool apps easily without knowing any web frameworks.

on Oct 22, 2019 in Apps, Data Science, Data Scientist, Python
Anomaly Detection, A Key Task for AI and Machine Learning, Explained

One way to process data faster and more efficiently is to detect abnormal events, changes or shifts in datasets. Anomaly detection refers to identification of items or events that do not conform to an expected pattern or to other items in a dataset that are usually undetectable by a human expert.

on Oct 21, 2019 in AI, Anomaly Detection, Explained, Sciforce, Unsupervised Learning
How YouTube is Recommending Your Next Video

If you are interested in learning more about the latest YouTube recommendation algorithm paper, read this post for details on its approach and improvements.

on Oct 21, 2019 in Recommendation Engine, Recommender Systems, Video, Youtube
This Microsoft Neural Network can Answer Questions About Scenic Images with Minimum Training

Recently, a group of AI experts from Microsoft Research published a paper proposing a method for scene understanding that combines two key tasks: image captioning and visual question answering (VQA).

on Oct 21, 2019 in Image Recognition, Microsoft, Neural Networks, Question answering, Training
5 Tips for Novice Freelance Data Scientists

If you want to launch your data science skills into freelance work, then check out these important tips to help you kick start your next adventure in data.

on Oct 18, 2019 in Advice, Beginners, Consulting, Data Scientist, Freelance
Writing Your First Neural Net in Less Than 30 Lines of Code with Keras

Read this quick overview of neural networks and learn how to implement your first in very few lines using Keras.

on Oct 18, 2019 in Keras, Neural Networks, Python
Data Anonymization – History and Key Ideas

While effective anonymization technology remains elusive, understanding the history of this challenge can guide data science practitioners to address these important concerns through ethical and responsible use of sensitive information.

on Oct 17, 2019 in Anonymity, Anonymized, History, Netflix, Privacy
Artificial Intelligence: Salaries Heading Skyward

While the average salary for a Software Engineer is around $100,000 to $150,000, to make the big bucks you want to be an AI or Machine Learning specialist, scientist, or engineer.

on Oct 17, 2019 in AI, Machine Learning Engineer, Machine Learning Scientist, Salary
How to Easily Deploy Machine Learning Models Using Flask

This post aims to help you get started with putting your trained machine learning models into production using a Flask API.

on Oct 17, 2019 in Deployment, Flask, Machine Learning, Python
How to Become a (Good) Data Scientist – Beginner Guide

A guide covering the things you should learn to become a data scientist, including the basics of business intelligence, statistics, programming, and machine learning.

on Oct 16, 2019 in Beginners, BI, Data Scientist, Sciforce, Statistics
Probability Learning: Bayes’ Theorem

Learn about one of the fundamental theorems of probability with an easy everyday example.

on Oct 16, 2019 in Bayes Theorem, Naive Bayes, Probability
The 5 Classification Evaluation Metrics Every Data Scientist Must Know

This post is about various evaluation metrics and how and when to use them.

on Oct 16, 2019 in Data Scientist, Machine Learning, Metrics, Python
Top 7 Things I Learned in my Data Science Masters

Even though I’m still in my studies, here’s a list of the most important things I’ve learned so far.

on Oct 15, 2019 in Data Science, Data Science Education, Tips
Research Guide for Video Frame Interpolation with Deep Learning

In this research guide, we’ll look at deep learning papers aimed at synthesizing video frames within an existing video.

on Oct 15, 2019 in Computer Vision, Deep Learning, Neural Networks, Video recognition
Three Things to Know About Reinforcement Learning

As an engineer, scientist, or researcher, you may want to take advantage of this new and growing technology, but where do you start? The best place to begin is to understand what the concept is, how to implement it, and whether it’s the right approach for a given problem.

on Oct 14, 2019 in MathWorks, Reinforcement Learning
Choosing a Machine Learning Model

Selecting the perfect machine learning model is part art and part science. Learn how to review multiple models and pick the best in both competitive and real-world applications.

on Oct 14, 2019 in Interpretability, Kaggle, Machine Learning
Using Neural Networks to Design Neural Networks: The Definitive Guide to Understand Neural Architecture Search

A recent survey outlined the main neural architecture search methods used to automate the design of deep learning systems.

on Oct 14, 2019 in Architecture, Automated Machine Learning, Neural Networks
An Overview of Density Estimation

Density estimation is estimating the probability density function of the population from the sample. This post examines and compares a number of approaches to density estimation.

on Oct 14, 2019 in Generative Adversarial Network, Probability, Statistics
Beyond Word Embedding: Key Ideas in Document Embedding

This literature review on document embedding techniques thoroughly covers the many ways practitioners develop rich vector representations of text -- from single sentences to entire books.

on Oct 11, 2019 in LDA, NLP, Topic Modeling, Trends, Word Embeddings
8 Paths to Getting a Machine Learning Job Interview

While you may be focused on your performance during your next job interview, landing that interview can be just as hard. Check out these tips for finding and securing an interview for a machine learning job.

on Oct 10, 2019 in Advice, Career, Jobs, Machine Learning
Lemma, Lemma, Red Pyjama: Or, doing words with AI

If we want a machine learning model to be able to generalize these forms together, we need to map them to a shared representation. But when are two different words the same for our purposes? It depends.

on Oct 10, 2019 in AI, NLP, Text Analytics
Activation maps for deep learning models in a few lines of code

We illustrate how to show the activation maps of various layers in a deep CNN model with just a couple of lines of code.

on Oct 10, 2019 in Architecture, Deep Learning, Neural Networks, Python
Four questions to help accurately scope analytics engineering project

Being really good at scoping analytics projects is crucial for team productivity and profitability. You can consistently deliver on time if you work out the issue first, and these four questions can help you prepare.

on Oct 9, 2019 in Analytics, Data Engineering, dbt, Deployment
Introduction to Artificial Neural Networks

In this article, we’ll try to cover everything related to Artificial Neural Networks or ANN.

on Oct 8, 2019 in Beginners, Gradient Descent, Neural Networks
Know Your Data: Part 2

To build an effective learning model, you must understand the quality issues that exist in data and how to detect and deal with them. In general, data quality issues fall into four major categories.

on Oct 8, 2019 in Beginners, Data Preparation, Data Preprocessing, Datasets
The 4 Quadrants of Data Science Skills and 7 Principles for Creating a Viral Data Visualization

As a data scientist, your most important skill is creating meaningful visualizations to disseminate knowledge and impact your organization or client. These seven principles will guide you toward developing charts with clarity, as exemplified with data from a recent KDnuggets poll.

on Oct 7, 2019 in Data Science, Data Science Skills, Data Visualization, Excel, Java, Python, Skills, TensorFlow
OpenAI Tried to Train AI Agents to Play Hide-And-Seek but Instead They Were Shocked by What They Learned

OpenAI trained agents in a simple game of hide-and-seek, and the agents learned many other skills in the process.

on Oct 7, 2019 in AI, OpenAI, Reinforcement Learning
10 Free Top Notch Natural Language Processing Courses

Are you looking to learn natural language processing? This collection of 10 free top notch courses will allow you to do just that, with something for every approach to learning NLP and its varied topics.

on Oct 7, 2019 in fast.ai, NLP, Oxford, spaCy, Stanford, U. of Washington, UC Berkeley, Yandex
The Last SQL Guide for Data Analysis You’ll Ever Need

This is it: the last SQL guide for data analysis you'll ever need! OK, maybe it’s actually the first. But it’ll give you a solid head start.

on Oct 4, 2019 in Cheat Sheet, Data Analysis, Data Science, SQL
5 Fundamental AI Principles

While AI may appear magical at times, these five principles will help guide you to avoid pitfalls when leveraging this tech.

on Oct 3, 2019 in AI, Data Cleaning, Deployment, Training Data
Recreating Imagination: DeepMind Builds Neural Networks that Spontaneously Replay Past Experiences

DeepMind researchers created a model that can replay past experiences in a way that simulates the mechanisms in the hippocampus.

on Oct 3, 2019 in DeepMind, Neural Networks
Data Preparation for Machine learning 101: Why it’s important and how to do it

As data scientists who are the brains behind the AI-based innovations, you need to understand the significance of data preparation to achieve the desired level of cognitive capability for your models. Let’s begin.

on Oct 2, 2019 in Data Preparation, Data Science, Machine Learning
Multi-Task Learning – ERNIE 2.0: State-of-the-Art NLP Architecture Intuitively Explained

The tech giant Baidu unveiled its state-of-the-art NLP architecture ERNIE 2.0 earlier this year, which scored significantly higher than XLNet and BERT on all tasks in the GLUE benchmark. This major breakthrough in NLP takes advantage of a new innovation called “Continual Incremental Multi-Task Learning”.

on Oct 2, 2019 in AISC, Architecture, Multitask Learning, NLP
A European Approach to Master’s Degrees in Data Science

Data science education in Europe has been reevaluated and new recommendations are leading the way to the next generation of data science Master's courses to better support and train students.

on Oct 1, 2019 in Education, Europe, Master of Science, MS in Analytics, MS in Data Science
Sentiment and Emotion Analysis for Beginners: Types and Challenges

There are three types of emotion AI, and their combinations. In this article, I’ll briefly go through these three types and the challenges of their real-life applications.

on Oct 1, 2019 in Beginners, Emotion, NLP, Sentiment Analysis
Clustering Metrics Better Than the Elbow Method

We show which metric to use for visualizing and determining an optimal number of clusters, one that works much better than the usual practice, the elbow method.

on Oct 1, 2019 in Clustering, Metrics
