All (86) | Events (2) | News, Education (7) | Opinions (19) | Top Stories, Tweets (10) | Tutorials, Overviews (48)
- Innovating versus Doing: NLP and CORD19 - Jun 30, 2020.
How I learned to trust the process and find value in the road most traveled.
- Stop training more models, start deploying them - Jun 30, 2020.
We are hardly living up to the promises of AI in healthcare. It’s not because of our training, it’s because of our deployment.
- Software engineering fundamentals for Data Scientists - Jun 30, 2020.
As a data scientist writing code for your models, it's quite possible that your work will make its way into a production environment to be used by the masses. But writing code that is deployed as software is very different from writing code for exploratory data analysis. Learn about the key approaches for making your code production-ready that will save you time and future headaches.
- How to Prepare Your Data - Jun 30, 2020.
This is an overview of structuring, cleaning, and enriching raw data.
- Learn Data Science from Top Universities for Free - Jun 29, 2020.
Where to find free lectures, seminars and complete courses from the likes of MIT, Stanford and Harvard.
- The Unreasonable Progress of Deep Neural Networks in Natural Language Processing (NLP) - Jun 29, 2020.
Natural language processing has made incredible advances through advanced techniques in deep learning. Learn about these powerful models, and find out how close (or far away) these approaches are to human-level understanding.
- An Introduction to Statistical Learning: The Free eBook - Jun 29, 2020.
This week's free eBook is a classic of data science, An Introduction to Statistical Learning, with Applications in R. If interested in picking up elementary statistical learning concepts, and learning how to implement them in R, this book is for you.
- Top Stories, Jun 22-28: How Much Math do you need in Data Science? - Jun 29, 2020.
Also: 4 Free Math Courses to do and Level up your Data Science Skills; Exploring the Real World of Data Science; Learning by Forgetting: Deep Neural Networks and the Jennifer Aniston Neuron; A TensorFlow Modeling Pipeline Using TensorFlow Datasets and TensorBoard
- Practical Markov Chain Monte Carlo - Jun 26, 2020.
This is a slightly more intricate MCMC example than the many that use a fairly simple model with a single predictor (maybe two) and not much else. It highlights a couple of issues and tricks worth noting for a handwritten implementation.
- How Much Math do you need in Data Science? - Jun 26, 2020.
Data scientists have many great computational tools available for their work. However, mathematical skills are still essential in data science and machine learning: without a theoretical foundation, these tools remain black boxes about which you cannot ask core analytical questions.
- Exploring the Real World of Data Science - Jun 26, 2020.
An article highlighting things I’ve learned in the real world about data science.
- Data Science Tools Popularity, animated - Jun 25, 2020.
Watch the evolution of the top 10 most popular data science tools based on KDnuggets software polls from 2000 to 2019.
- Lynx Analytics is open-sourcing LynxKite, its Complete Graph Data Science Platform - Jun 25, 2020.
Check out this article for a brief summary of what LynxKite is, where it comes from, and how it can help with your data science projects.
- Learning by Forgetting: Deep Neural Networks and the Jennifer Aniston Neuron - Jun 25, 2020.
DeepMind’s research shows how to understand the role of individual neurons in a neural network.
- Machine Learning Engineer vs Data Scientist (Is Data Science Over?) - Jun 25, 2020.
What has been happening to the definition of Data Scientist over the past 5 years? Does it still exist or has it morphed into a new version of its old self? Learn more about the recent trends in job descriptions and salaries for data scientists, ML engineers, and others to find the best fit for your career trajectory and interests.
- Free Economics & Finance Courses for Data Scientists - Jun 25, 2020.
Here is a selection of courses for those interested in diversifying their domain knowledge into the related realms of economics and finance, with the goal of applying their data science skills to these domains.
- Top KDnuggets tweets, Jun 17-23: The Best NLP with Deep Learning Course is Free - Jun 24, 2020.
Also: Speed up your Numpy and Pandas with NumExpr package; 5 Books That Will Teach You the Math Behind Machine Learning; How to Build a Data Science Web App in Python; Plotting in Pandas Just Got Prettier
- Five Lines of Code - Jun 24, 2020.
If you want to learn simple and practical rules for coding and refactoring, "Five Lines of Code" from Manning is the guide for you, teaching you concrete principles for refactoring. Save 40% with code nlfive40 until July 24.
- Build a Branded Web Based GIS Application Using R, Leaflet and Flexdashboard - Jun 24, 2020.
By using R, Flexdashboard and Leaflet, we can build a customized and branded web application to showcase location based data interactively across the organization. Instead of crowding the application with many widgets, we use menu tabs and pages to separate the interactive aspects.
- The 8 Basic Statistics Concepts for Data Science - Jun 24, 2020.
Understanding the fundamentals of statistics is a core capability for becoming a Data Scientist. Review these essential ideas that will be pervasive in your work and raise your expertise in the field.
- Time Complexity: How to measure the efficiency of algorithms - Jun 24, 2020.
When we consider the complexity of an algorithm, we shouldn’t really care about the exact number of operations that are performed; instead, we should care about how the number of operations relates to the problem size.
- A TensorFlow Modeling Pipeline Using TensorFlow Datasets and TensorBoard - Jun 23, 2020.
This article investigates TensorFlow components for building a toolset to make modeling evaluation more efficient. Specifically, TensorFlow Datasets (TFDS) and TensorBoard (TB) can be quite helpful in this task.
- Tools to Spot Deepfakes and AI-Generated Text - Jun 23, 2020.
The technologies that generate deepfake content are at the forefront of manipulating humans. While the research developing these algorithms is fascinating and will lead to powerful tools that enhance the way people create and work, in the wrong hands, these same tools drive misinformation at a scale we can't yet imagine. Stopping these bad actors using awesome tools is in your hands.
- Bias in AI: A Primer - Jun 23, 2020.
Those interested in studying AI bias, but who lack a starting point, would do well to check out this introductory set of slides and the accompanying talk on the subject from Google researcher Margaret Mitchell.
- Machine Learning in Dask - Jun 22, 2020.
In this piece, we’ll see how we can use Dask to work with large datasets on our local machines.
- 4 Free Math Courses to do and Level up your Data Science Skills - Jun 22, 2020.
Just as there is no Data Science without data, there's no science in data without mathematics. Strengthening your foundational skills in math will level you up as a data scientist and enable you to perform with greater expertise.
- How to Deal with Missing Values in Your Dataset - Jun 22, 2020.
In this article, we are going to talk about how to identify and treat the missing values in the data step by step.
- Top Stories, Jun 15-21: Easy Speech-to-Text with Python; A Complete guide to Google Colab for Deep Learning - Jun 22, 2020.
Also: Uber's Ludwig is an Open Source Framework for Low-Code Machine Learning; Understanding Machine Learning: The Free eBook; Best Machine Learning Youtube Videos Under 10 Minutes; The Most Important Fundamentals of PyTorch you Should Know
- Graph Machine Learning in Genomic Prediction - Jun 19, 2020.
This work explores how genetic relationships can be exploited alongside genomic information to predict genetic traits with the aid of graph machine learning algorithms.
- What is emotion AI and why should you care? - Jun 19, 2020.
What is emotion AI, why is it relevant, and what do you need to know about it?
- modelStudio and The Grammar of Interactive Explanatory Model Analysis - Jun 19, 2020.
modelStudio is an R package that automates the exploration of ML models and allows for interactive examination. It works in a model-agnostic fashion and is therefore compatible with most ML frameworks.
- 6 Easy Steps to Implement a Computer Vision Application Using Tensorflow.js - Jun 18, 2020.
In this article, we are going to see how we can implement computer vision applications using tensorflow.js models.
- The Most Important Fundamentals of PyTorch you Should Know - Jun 18, 2020.
PyTorch is a constantly developing deep learning framework with many exciting additions and features. We review its basic elements and show an example of building a simple Deep Neural Network (DNN) step-by-step.
- LightGBM: A Highly-Efficient Gradient Boosting Decision Tree - Jun 18, 2020.
LightGBM is a histogram-based algorithm which places continuous values into discrete bins, which leads to faster training and more efficient memory usage. In this piece, we’ll explore LightGBM in depth.
- Top KDnuggets tweets, Jun 10-16: A Crash Course in Game Theory for #MachineLearning: Classic and New Ideas - Jun 17, 2020.
Also: What is the Best #Python IDE for #DataScience?; Best Machine Learning Youtube Videos Under 10 Minutes - KDnuggets; Understanding Machine Learning: The Free eBook - KDnuggets; Math and Architectures of #DeepLearning; Introducing #MLOps, How to Scale #MachineLearning in the Enterprise
- Tom Fawcett, in memoriam - Jun 17, 2020.
Foster Provost in memoriam for Tom Fawcett, killed on June 4th in a freak bicycle accident. Tom was a brilliant scholar, a selfless collaborator, a substantial contributor to Data Science for three decades, and a unique individual.
- Build Dog Breeds Classifier Step By Step with AWS Sagemaker - Jun 17, 2020.
This post takes you through the basic steps for creating a cloud-based deep learning dog classifier, with everything accomplished from the AWS Management Console.
- A Classification Project in Machine Learning: a gentle step-by-step guide - Jun 17, 2020.
Classification is a core technique in the fields of data science and machine learning that is used to predict the categories to which data should belong. Follow this learning guide that demonstrates how to consider multiple classification models to predict data scraped from the web.
- Crop Disease Detection Using Machine Learning and Computer Vision - Jun 17, 2020.
Computer vision has tremendous promise for improving crop monitoring at scale. We present our learnings from building such models for detecting stem and wheat rust in crops.
- Best Machine Learning Youtube Videos Under 10 Minutes - Jun 16, 2020.
The Youtube videos on this list cover concepts such as what machine learning is, the basics of natural language processing, how computer vision works, and machine learning in video games.
- A Complete guide to Google Colab for Deep Learning - Jun 16, 2020.
Google Colab is a widely popular cloud service for machine learning that features free access to GPU and TPU computing. Follow this detailed guide to help you get up and running fast to develop your next deep learning algorithms with Colab.
- Simplified Mixed Feature Type Preprocessing in Scikit-Learn with Pipelines - Jun 16, 2020.
There is a quick and easy way to perform preprocessing on mixed feature type data in Scikit-Learn, which can be integrated into your machine learning pipelines.
- Uber’s Ludwig is an Open Source Framework for Low-Code Machine Learning - Jun 15, 2020.
The new framework allows developers with minimal experience to create and train machine learning models.
- Free Data Analytics Courses - Jun 15, 2020.
Wherever your skills are at today, check out these top course recommendations for 2020 to help you master data analytics.
- Understanding Machine Learning: The Free eBook - Jun 15, 2020.
Time to get back to basics. This week we have a look at a book on foundational machine learning concepts, Understanding Machine Learning: From Theory to Algorithms.
- Top Stories, Jun 8-14: Easy Speech-to-Text with Python; Natural Language Processing with Python: The Free eBook - Jun 15, 2020.
Also: Five Cognitive Biases In Data Science (And how to avoid them); Deploy a Machine Learning Pipeline to the Cloud Using a Docker Container; Naive Bayes Algorithm: Everything you need to know; The Best NLP with Deep Learning Course is Free
- Deploy a Machine Learning Pipeline to the Cloud Using a Docker Container - Jun 12, 2020.
In this tutorial, we will use a previously-built machine learning pipeline and Flask app to demonstrate how to deploy a machine learning pipeline as a web app using the Microsoft Azure Web App Service.
- Five Cognitive Biases In Data Science (And how to avoid them) - Jun 12, 2020.
Everyone is prey to cognitive biases that skew thinking, but data scientists must prevent them from spoiling their work. Learn more about five biases that can all too easily make your seemingly objective work become surprisingly subjective.
- Top 6 Reasons Data Scientists Should Know Java - Jun 12, 2020.
There are many reasons why data scientists should learn Java. Read this overview of 6 specific reasons to help decide if Java might be right for your projects.
- Upgrading the Brand Mobile App with Machine Learning - Jun 11, 2020.
Technical progress in mobile app development, along with digital enhancements, has created new opportunities for brands to attract and retain customers. In bridging the individualization gap, Machine Learning comes to the rescue.
- How to make AI/Machine Learning models resilient during COVID-19 crisis - Jun 11, 2020.
COVID-19-driven concept shift has raised concern over whether AI/ML can continue to drive business value, following cases of inaccurate outputs and misleading results in a variety of fields. Data Science teams must invest effort in post-model tracking and management, and build agility into the AI/ML process, to curb problems related to concept shift.
- Math and Architectures of Deep Learning - Jun 11, 2020.
This hands-on book bridges the gap between theory and practice, showing you the math of deep learning algorithms side by side with an implementation in PyTorch. Save 40% on Math and Architectures of Deep Learning with code nlkdarch40.
- Fighting Disease with Data: Q&A with Epidemiologist Amrish Baidjoe - Jun 11, 2020.
Data science tools are powerful for investigating the current pandemic and other outbreaks, when accurate and actionable data are crucial. Epidemiologist and R Epidemics Consortium leader Amrish Baidjoe shared his insights into using data science to fight disease, from modeling to automation to new technologies.
- Top KDnuggets tweets, Jun 3-9: The Best NLP with Deep Learning Course is Free - Jun 10, 2020.
Also: How Much #Math do you need in #DataScience?; Deep Learning for Detecting Pneumonia from X-ray Images - KDnuggets; How to Speed up Pandas by 4x with one line of code.
- Easy Speech-to-Text with Python - Jun 10, 2020.
In this blog post, I demonstrate how to convert speech to text using Python, with the help of the “Speech Recognition” API and the “PyAudio” library.
- Overview of data distributions - Jun 10, 2020.
With so many types of data distributions to consider in data science, how do you choose the right one to model your data? This guide gives an overview of the most important distributions you should be familiar with in your work.
- Centroid Initialization Methods for k-means Clustering, by Matthew Mayo - Jun 10, 2020.
This article is the first in a series of articles looking at the different aspects of k-means clustering, beginning with a discussion on centroid initialization.
- Top May Stories: The Best NLP with Deep Learning Course is Free - Jun 10, 2020.
Also: How to Think Like a Data Scientist; Python For Everybody: The Free eBook; Automated Machine Learning: The Free eBook.
- Count, the data notebook everyone can use - Jun 9, 2020.
Dashboards have been the primary weapon of choice for distributing data over the last few decades, but they have brought with them a new set of problems. To increasingly democratise access to data we need to think again.
- New Poll: What was the largest dataset you analyzed / data mined? - Jun 9, 2020.
Take part in the latest KDnuggets survey to have your voice heard, and let the community know the size of the largest dataset you have worked with.
- GPT-3, a giant step for Deep Learning and NLP? - Jun 9, 2020.
Recently, OpenAI announced GPT-3, the successor to its earlier language model and now the largest model trained so far, with 175 billion parameters. Training a language model this large has its merits and limitations, so this article covers some of its most interesting and important aspects.
- 5 Essential Papers on Sentiment Analysis - Jun 9, 2020.
To highlight some of the work being done in the field, here are five essential papers on sentiment analysis and sentiment classification.
- Naïve Bayes Algorithm: Everything you need to know - Jun 8, 2020.
Naïve Bayes is a probabilistic machine learning algorithm based on Bayes' theorem, used in a wide variety of classification tasks. In this article, we will work through the Naïve Bayes algorithm and all its essential concepts so that no room for doubt remains.
- Nitpicking Machine Learning Technical Debt - Jun 8, 2020.
Technical Debt in software development is pervasive. With machine learning engineering maturing, this classic trouble is unsurprisingly rearing its ugly head. These 25 best practices, first described in 2015 and promptly overshadowed by shiny new ML techniques, are updated for 2020 and ready for you to follow -- and lead the way to better ML code and processes in your organization.
- Natural Language Processing with Python: The Free eBook - Jun 8, 2020.
This free eBook is an introduction to natural language processing, and to NLTK, one of the most prevalent Python NLP libraries.
- Top Stories, Jun 1-7: Don’t Democratize Data Science; Deep Learning for Coders with fastai and PyTorch: The Free eBook - Jun 8, 2020.
Also: Deep Learning for Detecting Pneumonia from X-ray Images; From Languages to Information: Another Great NLP Course from Stanford; Skills to Build for Data Engineering; If you had to start statistics all over again, where would you start?
- Why Do AI Systems Need Human Intervention to Work Well? - Jun 5, 2020.
All is not well with artificial intelligence-based systems during the coronavirus pandemic. No, the virus does not impact AI – however, it does impact humans, without whom AI and ML systems cannot function properly. Surprised?
- If you had to start statistics all over again, where would you start? - Jun 5, 2020.
If you are just diving into learning statistics, then where do you begin? Find insight from those who have waded into these waters before, and see what they might have done differently along their personal journeys in statistics.
- Deep Learning for Detecting Pneumonia from X-ray Images - Jun 5, 2020.
This article covers an end-to-end pipeline for pneumonia detection from X-ray images.
- Metis Webinar: Deep Learning Approaches to Forecasting - Jun 4, 2020.
Metis Corporate Training is offering Deep Learning Approaches to Forecasting and Planning, a free webinar focusing on the intuition behind various deep learning approaches, and exploring how business leaders, data science managers, and decision makers can tackle highly complex models by asking the right questions, and evaluating the models with familiar tools.
- Upcoming Webinars and Online Events in AI, Data Science, Machine Learning: June - Jun 4, 2020.
Here are some interesting upcoming webinars, online events, and virtual conferences in AI, Data Science, and Machine Learning.
- Machine Learning Experiment Tracking - Jun 4, 2020.
Why is experiment tracking so important for doing real-world machine learning?
- 5 Essential Papers on AI Training Data - Jun 4, 2020.
Data pre-processing is not only the largest time sink for most Data Scientists, but it is also the most crucial aspect of the work. Learn more about training data and data processing tasks from 5 leading academic papers.
- Skills to Build for Data Engineering - Jun 4, 2020.
This article looks at the latest skill set trends observed in the Data Engineering job market, which could boost your existing career or help you start your Data Engineering journey.
- Top KDnuggets tweets, May 27 – Jun 02: Deep Learning for Coders with fastai and PyTorch: The Free eBook - Jun 3, 2020.
Also: Machine Learning from First Principles; The Best NLP with Deep Learning Course is Free; Top Stories, May 25-31: Python For Everybody: The Free eBook; Interactive Machine Learning Experiments.
- Introduction to Convolutional Neural Networks - Jun 3, 2020.
This article explains the key components of a CNN and its implementation using the Keras Python library.
- 3 Key Data Science Questions to Ask Your Big Data - Jun 3, 2020.
The process of understanding your data begins by asking 3 questions at the highest level, and then iteratively asking hundreds of cascading questions to get deeper insights.
- From Languages to Information: Another Great NLP Course from Stanford - Jun 3, 2020.
Check out another example of a Stanford NLP course and its freely available courseware.
- STIPS – Statistical Thinking for Industrial Problem Solving – A free online statistics course - Jun 2, 2020.
This online course is available – for free – to anyone interested in building practical skills in using data to solve problems better.
- Four Ways to Apply NLP in Financial Services - Jun 2, 2020.
Natural language processing (NLP) is increasingly used to review unstructured content or spot trends in markets. How is Refinitiv Labs applying NLP in financial services to meet challenges around investment decision-making and risk management?
- Don’t Democratize Data Science - Jun 2, 2020.
A plethora of online courses and tools promise to democratize the field, but just learning a few basic skills does not a true data scientist make.
- Docker: Containerization for Data Scientists - Jun 2, 2020.
This article is a simple explanation of containerization with Docker.
- Forecasting Stories 4: Time-series too, Causal too - Jun 1, 2020.
This article tells the story of making effective business decisions on the basis of a combined model. Let us study together how these components work hand in hand.
- Introduction to Pandas for Data Science - Jun 1, 2020.
The Pandas library is core to any Data Science work in Python. This introduction will walk you through the basics of data manipulation and covers many of Pandas' important features.
- Deep Learning for Coders with fastai and PyTorch: The Free eBook - Jun 1, 2020.
If you are interested in a top-down, example-driven book on deep learning, check out the draft of the upcoming Deep Learning for Coders with fastai & PyTorch from the fast.ai team.
- Top Stories, May 25-31: Python For Everybody: The Free eBook; Interactive Machine Learning Experiments - Jun 1, 2020.
Also: Dataset Splitting Best Practices in Python; 10 Useful Machine Learning Practices For Python Developers; How to Think Like a Data Scientist or Data Analyst; The Best NLP with Deep Learning Course is Free