2020 Nov Tutorials, Overviews
- Deploying Trained Models to Production with TensorFlow Serving - Nov 30, 2020.
TensorFlow provides a way to move a trained model to a production environment for deployment with minimal effort. In this article, we’ll use a pre-trained model, save it, and serve it using TensorFlow Serving.
- Data Science History and Overview - Nov 30, 2020.
In this era of big data that is only getting bigger, a huge amount of information from different fields is gathered and stored. Analyzing it and extracting value from it has become one of the most attractive tasks for companies and society in general, a task taken up by the new professional role of the Data Scientist.
- A Friendly Introduction to Graph Neural Networks - Nov 30, 2020.
Although they can be a confusing topic, graph neural networks can be distilled into just a handful of simple concepts. Read on to find out more.
- Learn Deep Learning with this Free Course from Yann LeCun - Nov 27, 2020.
Here is a freely-available NYU course on deep learning to check out from Yann LeCun and Alfredo Canziani, including videos, slides, and other helpful resources.
- Better data apps with Streamlit’s new layout options - Nov 26, 2020.
Introducing new layout primitives - including columns, containers and expanders!
- Essential Math for Data Science: Integrals And Area Under The Curve - Nov 25, 2020.
In this article, you’ll learn about integrals and the area under the curve using the practical data science example of the area under the ROC curve used to compare the performances of two machine learning models.
- How to Incorporate Tabular Data with HuggingFace Transformers - Nov 25, 2020.
In real-world scenarios, we often encounter data that includes both text and tabular features. By leveraging the latest advances in transformers, you can handle both data structures effectively and increase the performance of your models.
- Simple Python Package for Comparing, Plotting & Evaluating Regression Models - Nov 25, 2020.
This package aims to help users plot evaluation metric graphs for widely used regression model metrics with a single line of code, comparing them at a glance. The utility also significantly lowers the barrier for practitioners, even at an amateur level, to evaluate different machine learning algorithms by applying them to their everyday predictive regression problems.
- TabPy: Combining Python and Tableau - Nov 24, 2020.
This article demonstrates how to get started using Python in Tableau.
- Know-How to Learn Machine Learning Algorithms Effectively - Nov 23, 2020.
The takeaway from the story is that machine learning goes well beyond simple fit and predict methods. The author shares their approach to actually learning these algorithms beyond the surface.
- Computer Vision at Scale With Dask And PyTorch - Nov 23, 2020.
A tutorial on conducting image classification inference with the ResNet50 deep learning model at scale, using GPU clusters on Saturn Cloud. The result: 40x faster computer vision that made a 3+ hour PyTorch model run in just 5 minutes.
- Adversarial Examples in Deep Learning – A Primer - Nov 20, 2020.
Bigger compute has led to increasingly impressive SOTA results for deep learning computer vision models. However, most of these SOTA models are brought to their knees when making predictions on adversarial images. Read on to find out more.
- Cellular Automata in Stream Learning - Nov 20, 2020.
In this post, we will start by presenting CA as pattern recognition methods for stream learning. Finally, we will briefly mention two recent CA-based solutions for stream learning. Both are highly interpretable, as their cellular structure directly represents the mapping between the feature space and the labels to be predicted.
- AI and Automation meets BI - Nov 19, 2020.
Organizations use a variety of BI tools to analyze structured data. These tools are used for ad-hoc analysis, and for dashboards and reports that are essential for decision making. In this post, we describe a new set of BI tools that continue this trend.
- Kubernetes vs. Amazon ECS for Data Scientists - Nov 19, 2020.
In this article, we’ll look at two container management solutions — Kubernetes and Amazon Elastic Container Service (ECS) — from a perspective that makes sense for aspiring and current data scientists.
- Hypothesis Vetting: The Most Important Skill Every Successful Data Scientist Needs - Nov 18, 2020.
A well-thought-out hypothesis sets the direction and plan for a Data Science project. Accordingly, a hypothesis is the most important item for evaluating whether a Data Science project will be successful.
- 5 Most Useful Machine Learning Tools every lazy full-stack data scientist should use - Nov 18, 2020.
If you consider yourself a Data Scientist who can take any project from data curation to solution deployment, then you know there are many tools available today to help you get the job done. The trouble is that there are too many choices. Here is a review of five sets of tools that should turn you into the most efficient full-stack data scientist possible.
- How to Future-Proof Your Data Science Project - Nov 18, 2020.
This article outlines 5 critical elements of ML model selection & deployment.
- Facebook Open Sourced New Frameworks to Advance Deep Learning Research - Nov 17, 2020.
Polygames, PyTorch3D and HiPlot are the new additions to Facebook’s open source deep learning stack.
- Algorithms for Advanced Hyper-Parameter Optimization/Tuning - Nov 17, 2020.
In informed search, each iteration learns from the last, whereas in Grid and Random search, modelling is all done at once and then the best result is picked. For small datasets, GridSearch or RandomSearch would be fast and sufficient. AutoML approaches provide a neat solution for properly selecting the hyperparameters that improve the model’s performance.
- 5 Things You Are Doing Wrong in PyCaret - Nov 16, 2020.
PyCaret is an alternative low-code library that can be used to replace hundreds of lines of code with only a few words. This makes experiments much faster and more efficient. Find out 5 ways to improve your usage of the library.
- Top Python Libraries for Deep Learning, Natural Language Processing & Computer Vision - Nov 16, 2020.
This article compiles the 30 top Python libraries for deep learning, natural language processing & computer vision, as best determined by KDnuggets staff.
- From Y=X to Building a Complete Artificial Neural Network - Nov 13, 2020.
In this tutorial, we will start with the most simple artificial neural network (ANN) and move to something much more complex. We begin by building a machine learning model with no parameters, which is Y=X.
- tensorflow + dalex = :) , or how to explain a TensorFlow model - Nov 13, 2020.
Having a machine learning model that generates interesting predictions is one thing. Understanding why it makes these predictions is another. For a TensorFlow predictive model, it can be straightforward and convenient to develop explainable AI by leveraging the dalex Python package.
- How to Acquire the Most Wanted Data Science Skills - Nov 13, 2020.
We recently surveyed KDnuggets readers to determine the "most wanted" data science skills. Since they seem to be those most in demand from practitioners, here is a collection of resources for getting started with this learning.
- 5 Tricky SQL Queries Solved - Nov 12, 2020.
Explaining the approach to solving a few complex SQL queries.
- Do’s and Don’ts of Analyzing Time Series - Nov 12, 2020.
When handling time series data in your Data Science analysis work, a variety of common mistakes are made that are basic, but very important, to the processing of this type of data. Here, we review these issues and recommend the best practices.
- Free From MIT: Intro to Computational Thinking with Julia - Nov 12, 2020.
Introduction to Computational Thinking with Julia, with Applications to Modeling the COVID-19 Pandemic is another freely-available offering from MIT's Open Courseware.
- Most Popular Distance Metrics Used in KNN and When to Use Them - Nov 11, 2020.
To calculate distances, KNN uses a distance metric chosen from a list of available metrics. Read this article for an overview of these metrics and when they should be considered for use.
- Learn to build an end to end data science project - Nov 11, 2020.
Appreciating the process you must work through for any Data Science project is valuable before you land your first job in this field. With a well-honed strategy, such as the one outlined in this example project, you will remain productive and consistently deliver valuable machine learning models.
- Mastering TensorFlow Tensors in 5 Easy Steps - Nov 11, 2020.
Discover how the building blocks of TensorFlow work at the lower level and learn how to make the most of Tensor objects.
- Every Complex DataFrame Manipulation, Explained & Visualized Intuitively - Nov 10, 2020.
Most Data Scientists might hail the power of Pandas for data preparation, but many may not be capable of leveraging all that power. Manipulating data frames can quickly become a complex task, so eight of these techniques within Pandas are presented with an explanation, visualization, code, and tricks to remember how to do it.
- Change the Background of Any Image with 5 Lines of Code - Nov 9, 2020.
Blur, color, grayscale and change the background of any image with a picture using PixelLib.
- Pandas on Steroids: End to End Data Science in Python with Dask - Nov 6, 2020.
End to end parallelized data science from reading big data to data manipulation to visualisation to machine learning.
- How to Build a Football Dataset with Web Scraping - Nov 5, 2020.
This article covers using Selenium to scrape JavaScript rendered content.
- Top 5 Free Machine Learning and Deep Learning eBooks Everyone should read - Nov 5, 2020.
There is always so much new to learn in machine learning, and keeping well grounded in the fundamentals will help you stay up-to-date with the latest advancements while acing your career in Data Science.
- How to deploy PyTorch Lightning models to production - Nov 5, 2020.
A complete guide to serving PyTorch Lightning models at scale.
- Interpretability, Explainability, and Machine Learning – What Data Scientists Need to Know - Nov 4, 2020.
The terms “interpretability,” “explainability” and “black box” are tossed about a lot in the context of machine learning, but what do they really mean, and why do they matter?
- Building Deep Learning Projects with fastai — From Model Training to Deployment - Nov 4, 2020.
A getting-started guide to developing computer vision applications with fastai.
- 10 Principles of Practical Statistical Reasoning - Nov 3, 2020.
Practical Statistical Reasoning is a term that covers the nature and objective of applied statistics/data science, principles common to all applications, and practical steps/questions for better conclusions. The following principles have helped me become more efficient with my analyses and clearer in my conclusions.
- Topic Modeling with BERT - Nov 3, 2020.
Leveraging BERT and TF-IDF to create easily interpretable topics.
- Microsoft and Google Open Sourced These Frameworks Based on Their Work Scaling Deep Learning Training - Nov 2, 2020.
Google and Microsoft have recently released new frameworks for distributed deep learning training.
- Top Python Libraries for Data Science, Data Visualization & Machine Learning - Nov 2, 2020.
This article compiles the 38 top Python libraries for data science, data visualization & machine learning, as best determined by KDnuggets staff.