2020 Feb

Can Edge Analytics Become a Game Changer?

Edge analytics is considered to be the future of sensor handling, and this article discusses its benefits and the architecture of modern edge devices, gateways, and sensors. Deep learning for edge analytics is also considered, along with a review of experiments in human and chess figure detection using edge devices.

on Feb 28, 2020 in Edge Analytics, IoT, Sciforce
The Big Bad NLP Database: Access Nearly 300 Datasets

Check out this database of nearly 300 freely-accessible NLP datasets, curated from around the internet.

on Feb 28, 2020 in Datasets, NLP, Text Mining
Hands on Hyperparameter Tuning with Keras Tuner

Or how hyperparameter tuning with Keras Tuner can boost your object classification network's accuracy by 10%.

on Feb 28, 2020 in Automated Machine Learning, AutoML, Keras, Python
Decision Tree Intuition: From Concept to Application

While the use of Decision Trees in machine learning has been around for a while, the technique remains powerful and popular. This guide first provides an introductory understanding of the method and then shows you how to construct a decision tree, calculate important analysis parameters, and plot the resulting tree.

on Feb 27, 2020 in Beginners, Decision Trees, Machine Learning
Introducing fastpages: An easy to use blogging platform with extra features for Jupyter Notebooks

This article introduces the easy-to-use blogging platform fastpages. fastpages relies on GitHub Pages for hosting and GitHub Actions to automate the creation of your blog, and contains extra features for Jupyter Notebooks.

on Feb 27, 2020 in Blogs, fast.ai, Jupyter, Programming
Data Science Curriculum for self-study

Are you asking the question, "how do I become a Data Scientist?" This list recommends the best essential topics to gain an introductory understanding for getting started in the field. After learning these basics, keep in mind that doing real data science projects through internships or competitions is crucial to acquiring the core skills necessary for the job.

on Feb 26, 2020 in Advice, Data Science, Data Science Education, Data Visualization, Mathematics, Probability, Programming, Statistics
Image Recognition and Object Detection in Retail

“According to Gartner, by 2020, 85% of customer interactions in the retail industry will be managed by AI.”

on Feb 26, 2020 in Image Recognition, Object Detection, Retail
Python and R Courses for Data Science

Since Python and R are a must for today's data scientists, continuous learning is paramount. Online courses are arguably the best and most flexible way to upskill throughout one's career.

on Feb 26, 2020 in Coursera, Data Science, edX, MOOC, Programming, Python, R
Probability Distributions in Data Science

Some machine learning models are designed to work best under certain distribution assumptions. Therefore, knowing which distributions we are working with can help us identify which models are best to use.

on Feb 26, 2020 in Data Science, Distribution, Normal Distribution, Probability
Free Mathematics Courses for Data Science & Machine Learning

It's no secret that mathematics is the foundation of data science. Here is a selection of courses to help increase your maths skills to excel in data science, machine learning, and beyond.

on Feb 25, 2020 in Courses, Data Science, Machine Learning, Mathematics, MOOC
Audio Data Analysis Using Deep Learning with Python (Part 2)

This is a followup to the first article in this series. Once you are comfortable with the concepts explained in that article, you can come back and continue with this.

on Feb 25, 2020 in Audio, Data Preprocessing, Deep Learning, Python
Leaders, Changes, and Trends in Gartner 2020 Magic Quadrant for Data Science and Machine Learning Platforms

The Gartner 2020 Magic Quadrant for Data Science and Machine Learning Platforms has the largest number of leaders ever. We examine the leaders, changes, and trends compared with previous years.

on Feb 24, 2020 in Alteryx, Data Science Platform, Databricks, Dataiku, DataRobot, Domino, Gartner, Google, H2O, IBM, Knime, Machine Learning, Magic Quadrant, MathWorks, Microsoft Azure, RapidMiner, SAS, TIBCO
Microsoft Open Sources ZeRO and DeepSpeed: The Technologies Behind the Biggest Language Model in History

The two efforts enable the training of deep learning models at massive scale.

on Feb 24, 2020 in Microsoft, NLP
Prepare for a Long Battle against Deepfakes

While deepfakes threaten to destroy our perception of reality, the tech giants are throwing down the gauntlet and working to enhance the state of the art in combating doctored videos and images.

on Feb 21, 2020 in AI, Crime, Deepfakes, Facebook, Google, Politics, Trends, Twitter
Passive Data Collection and Actionable Results: What to Know

There are plenty of ways to get actionable results by using passive data. However, such an outcome will not happen without careful forethought. Data analysts must consider several crucial specifics, including what questions they want and expect the information to answer, and how they'll apply the findings to aid the business.

on Feb 21, 2020 in Analytics, Customer Analytics, Data Curation, Datasets
How Kubeflow Can Add AI to Your Kubernetes Deployments

As Kubernetes is capable of working with other solutions, it is possible to integrate it with a collection of tools that can almost fully automate your development pipeline. Some of those third-party tools even allow you to integrate AI into Kubernetes. One such tool you can integrate with Kubernetes is Kubeflow. Read more about it here.

on Feb 21, 2020 in AI, Deployment, Kubernetes
The Death of Data Scientists – will AutoML replace them?

Soon after tech giants Google and Microsoft introduced their AutoML services to the world, the popularity of and interest in these services skyrocketed. We first review AutoML, compare the platforms available, and then test them out against real data scientists to answer the question: will AutoML replace us?

on Feb 20, 2020 in AutoML, Data Scientist, Trends
The Forgotten Algorithm

This article explores Monte Carlo Simulation with Streamlit.

on Feb 20, 2020 in Monte Carlo, Python, Simulation, Streamlit
Hand labeling is the past. The future is #NoLabel AI

Data labeling is so hot right now… but could this rapidly emerging market face disruption from a small team at Stanford and the Snorkel open source project, which enables highly efficient programmatic labeling that is 10 to 1,000x as efficient as hand labeling?

on Feb 19, 2020 in AI, Data Labeling, Data Preparation, Training Data
Getting Started with R Programming

An end-to-end data analysis using R, the second most requested programming language in data science.

on Feb 19, 2020 in Data Science, Machine Learning, Programming, R
Audio Data Analysis Using Deep Learning with Python (Part 1)

A brief introduction to audio data processing and genre classification using neural networks and Python.

on Feb 19, 2020 in Audio, Data Processing, Deep Learning, Python
20 AI, Data Science, Machine Learning Terms You Need to Know in 2020 (Part 1)

2020 is well underway, and we bring you 20 AI, data science, and machine learning terms we should all be familiar with as the year marches onward.

on Feb 18, 2020 in AI, Data Science, Key Terms, Machine Learning
Using the Fitbit Web API with Python

Fitbit provides a Web API for accessing data from Fitbit activity trackers. Check out this updated tutorial on accessing Fitbit data using the API with Python.

on Feb 18, 2020 in API, Fitness, Health, Python
Scaling the Wall Between Data Scientist and Data Engineer

The educational and research focuses of machine learning tend to highlight the model building, training, testing, and optimization aspects of the data science process. Bringing these models into use requires a suite of engineering feats and organization, a standard for which does not yet exist. Learn more about a framework for operating a collaborative data science and engineering team to deploy machine learning models to end-users.

on Feb 17, 2020 in Advice, Data Engineer, Data Engineering, Data Scientist, Deployment, DevOps, Machine Learning Engineer, MLflow, MLOps, Production
Using AI to Identify Wildlife in Camera Trap Images from the Serengeti

With recent developments in machine learning and computer vision, we acquired the tools to provide the biodiversity community with an ability to tap the potential of the knowledge generated automatically with systems triggered by a combination of heat and motion.

on Feb 17, 2020 in Africa, AI, Computer Vision, Machine Learning
Inside The Machine Learning that Google Used to Build Meena: A Chatbot that Can Chat About Anything

Meena is one of the major milestones in the history of NLU. How did Google build it?

on Feb 17, 2020 in Chat, Chatbot, Google, Machine Learning, NLU
Deep Neural Networks

We examine the features and applications of a deep neural network.

on Feb 14, 2020 in Applications, Deep Learning, Neural Networks, Robots
What Does it Mean to Deploy a Machine Learning Model?

You are a Data Scientist who knows how to develop machine learning models. You might also be a Data Scientist who is too afraid to ask how to deploy your machine learning models. The answer isn't entirely straightforward, and so is a major pain point of the community. This article will help you take a step in the right direction for production deployments that are automated, reproducible, and auditable.

on Feb 14, 2020 in Deployment, Machine Learning, MLOps
Fourier Transformation for a Data Scientist

This article gives a brief mathematical introduction to the Fourier transform and its applications in AI.

on Feb 14, 2020 in Data Science, Data Scientist, Mathematics, Python
Introduction to Geographical Time Series Prediction with Crime Data in R, SQL, and Tableau

When reviewing geographical data, it can be difficult to prepare the data for an analysis. This article helps by covering importing data into a SQL Server database; cleansing and grouping data into a map grid; adding time data points to the set of grid data and filling in the gaps where no crimes occurred; importing the data into R; running an XGBoost model to determine where crimes will occur on a specific day.

on Feb 14, 2020 in Crime, Geospatial, R, SQL, Tableau, Time Series
Adversarial Validation Overview

Learn how to implement adversarial validation that builds a classifier to determine if your data is from the training or testing sets. If you can do this, then your data has issues, and your adversarial validation model can help you diagnose the problem.

on Feb 13, 2020 in Adversarial, Kaggle, Machine Learning, Python, Validation
Practical Hyperparameter Optimization

An introduction to fine-tuning machine learning and deep learning models using techniques such as random search, automated hyperparameter tuning, and artificial neural network tuning.

on Feb 13, 2020 in Automated Machine Learning, AutoML, Deep Learning, Hyperparameter, Machine Learning, Optimization, Python, scikit-learn
Easy Image Dataset Augmentation with TensorFlow

What can we do when we don't have a substantial amount of varied training data? This is a quick intro to using data augmentation in TensorFlow to perform in-memory image transformations during model training to help overcome this data impediment.

on Feb 13, 2020 in Data Preprocessing, Image Processing, Image Recognition, Python, TensorFlow
Math for Programmers – your guide for solving math problems in code

Math for Programmers teaches you the math you need to know for a career in programming, concentrating on what you need to know as a developer.

on Feb 12, 2020 in Book, Manning, Mathematics, Programming
Why Did I Reject a Data Scientist Job?

Snagging that job as a Data Scientist might not be exactly what you were expecting. Consider this advice on carefully weighing job titles against what the position might really be like day to day.

on Feb 12, 2020 in Advice, Career, Data Scientist
Sharing your machine learning models through a common API

DEEPaaS API is a software component developed to expose machine learning models through a REST API. In this article we describe how to do it.

on Feb 12, 2020 in API, Deep Learning, Machine Learning, Open Source, Python
Illustrating the Reformer

In this post, we will try to dive into the Reformer model and try to understand it with some visual guides.

on Feb 12, 2020 in NLP, Reformer, Transformer
Fidelity on How to Find a Tailor-Fit Unicorn Data Scientist

Predictive Analytics World for Financial Services in Las Vegas, May 31-Jun 4, is honored to host an exceptional keynote by Fidelity Investments’ AI and Data Science Center of Excellence Leader, Victor Lo: "How to Find a Tailor-Fit 'Unicorn' Data Scientist for Financial Services". Use the code KDNUGGETS for a 15% discount on your Predictive Analytics World ticket.

on Feb 11, 2020 in Data Science, PAW, Predictive Analytics World, Unicorn
How to learn data science on your own: a practical guide

While much focus today is on the rise in working from home and the challenges experienced, not as much is said about learning from home. For those lone wolves studying Data Science in a self-directed way, a range of issues can get in the way of your goal. Learn about these common problems so you can stay focused all the way to your educational goals.

on Feb 11, 2020 in Advice, Beginners, Data Science Education, MOOC
Basics of Audio File Processing in R

This post provides basic information on audio processing using R as the programming language. It also walks through some basics of sound and digital audio.

on Feb 11, 2020 in Audio, Data Processing, R
Recommender System Metrics: Comparing Apples, Oranges and Bananas

This article discusses a sometimes-overlooked aspect of what distinguishes recommender systems from other machine learning tasks: the added uncertainties of measuring them.

on Feb 11, 2020 in Metrics, Recommendation Engine, Recommender Systems
Observability for Data Engineering

Going beyond traditional monitoring techniques and goals, understanding if a system is working as intended requires a new concept in DevOps, called Observability. Learn more about this essential approach to bring more context to your system metrics.

on Feb 10, 2020 in Data Engineering, DevOps, Explainability, KPI, Monitoring, Time Series
Intent Recognition with BERT using Keras and TensorFlow 2

TL;DR Learn how to fine-tune the BERT model for text classification. Train and evaluate it on a small dataset for detecting seven intents. The results might surprise you!

on Feb 10, 2020 in BERT, Keras, NLP, Python, TensorFlow
Amazon Uses Self-Learning to Teach Alexa to Correct its Own Mistakes

The digital assistant incorporates a reformulation engine that can learn to correct responses in real time based on customer interactions.

on Feb 10, 2020 in Alexa, Amazon, Learning
AI and Machine Learning In Our Every Day Life

The curiosity and buzz around the most talked-about technology -- Artificial Intelligence -- have experts and technophiles busy decoding its exciting future applications. Of course, the use of AI and machine learning is already pervasive in our daily lives, as we review many of these popular features in this article.

on Feb 7, 2020 in AI, Fraud Detection, Gmail, Machine Learning, Search, Social Media, Travel
The Data Science Puzzle — 2020 Edition

The data science puzzle is re-examined through the relationship between several key concepts of the landscape, incorporating updates and observations since last time. Check out the results here.

on Feb 7, 2020 in AI, Big Data, Data Mining, Data Science, Deep Learning, Machine Learning
Understanding Density-based Clustering

HDBSCAN is a robust clustering algorithm that is very useful for data exploration, and this comprehensive introduction provides an overview of its fundamental ideas from a high-level view above the trees to down in the weeds.

on Feb 6, 2020 in Clustering, DBSCAN, K-means, Segmentation
Getting up and Running with Python: Installing Anaconda on Windows

This tutorial covers how to download and install Anaconda on Windows; how to test your installation; how to fix common installation issues; and what to do after installing Anaconda.

on Feb 6, 2020 in Anaconda, Python
Intro to Machine Learning and AI based on high school knowledge

Machine learning information is becoming pervasive in the media as well as a core skill in new, important job sectors. Getting started in the field can require learning complex concepts, and this article outlines an approach on how to begin learning about these exciting topics based on high school knowledge.

on Feb 5, 2020 in AI, Beginners, Linear Regression, Machine Learning, Mathematics
Create Your Own Computer Vision Sandbox

This post covers a wide array of computer vision tasks, from automated data collection to CNN model building.

on Feb 5, 2020 in Computer Vision, Convolutional Neural Networks, Python
Optimal Estimation Algorithms: Kalman and Particle Filters

An introduction to the Kalman and Particle Filters and their applications in fields such as Robotics and Reinforcement Learning.

on Feb 5, 2020 in Kalman Filters, Machine Learning, Probability
Audio File Processing: ECG Audio Using Python

In this post, we look at an application of audio file processing for a good cause, the analysis of ECG heartbeats, and write the code in Python.

on Feb 4, 2020 in Audio, Data Processing, Health, Python
Serverless Machine Learning with R on Cloud Run

Expedite the deployment of your machine learning models using serverless cloud infrastructure. In this tutorial, we explore creating and deploying a model which scrapes real-time Twitter data and returns an interactive visualization using R.

on Feb 4, 2020 in Cloud, Machine Learning, R, Twitter
Why are Machine Learning Projects so Hard to Manage?

What makes deploying a machine learning project so difficult? Is it the expectations? The people? The tech? There are common threads to these challenges, and best practices exist to deal with them.

on Feb 3, 2020 in Deployment, Kaggle, Lukas Biewald, Machine Learning, Project Fail, Training Data
