2020 Sep
- Top KDnuggets tweets, Sep 23-29: An Introduction to #AI – updated for 2020; Master using Pandas for time series analysis - Sep 30, 2020.
An Introduction to #AI - updated for 2020; Free From MIT: Intro to Computer Science and Programming in Python; The Most Complete Guide to #PyTorch for Data Scientists; (Good) Data Cleaning is just reusable Data Transformations
- AI in Healthcare: A review of innovative startups - Sep 30, 2020.
AI innovation in healthcare has been overwhelming: the global healthcare AI market accounted for $0.95 billion in 2017 and is expected to reach $19.25 billion by 2026. What drives this vibrant growth?
- Machine Learning Model Deployment - Sep 30, 2020.
Read this article on deploying machine learning models with serverless compute, which abstracts away provisioning, managing servers, and configuring software, simplifying model deployment.
- The Best Free Data Science eBooks: 2020 Update - Sep 30, 2020.
The author has updated their list of best free data science books for 2020. Read on to see what books you should grab.
- Are Data Analytics and Data Science Two Separate Fields? - Sep 29, 2020.
How are the fields of Data Analytics and Data Science related? Read this post by John Thompson, author of the new Packt book "Building Analytics Teams" to gain an understanding of the link between the two.
- Missing Value Imputation – A Review - Sep 29, 2020.
Detecting and handling missing values in the correct way is important, as they can impact the results of the analysis, and there are algorithms that can’t handle them. So what is the correct way?
- International alternatives to Kaggle for Data Science / Machine Learning competitions - Sep 29, 2020.
While Kaggle might be the most well-known, go-to data science competition platform to test your skills at model building and performance, additional regional platforms are available around the world that offer even more opportunities to learn... and win.
- How AI is Driving Innovation in Astronomy - Sep 29, 2020.
In this blog, we look at a disruptive AI program - Morpheus - developed by Researchers at UC Santa Cruz that can analyze astronomical image data and classify galaxies and stars with surgical precision. If you're reading this with "starry" eyes, we bet we've got you hooked.
- Top Stories, Sep 21-27: Introduction to Time Series Analysis in Python; Machine Learning from Scratch: Free Online Textbook - Sep 28, 2020.
Also: How I Consistently Improve My Machine Learning Models From 80% to Over 90% Accuracy; I'm a Data Scientist, Not Just The Tiny Hands that Crunch your Data; New Poll: What Python IDE / Editor you used the most in 2020?; The Most Complete Guide to PyTorch for Data Scientists
- Looking Inside The Blackbox: How To Trick A Neural Network - Sep 28, 2020.
In this tutorial, I’ll show you how to use gradient ascent to figure out how to misclassify an input.
- Geographical Plots with Python - Sep 28, 2020.
When your data includes geographical information, rich map visualizations can offer significant value for you to understand your data and for the end user when interpreting analytical results.
- The Online Courses You Must Take to be a Better Data Scientist - Sep 28, 2020.
These select courses have proved to be precious online resources which helped make the author a better data scientist today.
- Making Python Programs Blazingly Fast - Sep 25, 2020.
Let’s look at the performance of our Python programs and see how to make them up to 30% faster!
- Create and Deploy your First Flask App using Python and Heroku - Sep 25, 2020.
Flask is a straightforward and lightweight web application framework for Python applications. This guide walks you through how to write an application using Flask with a deployment on Heroku.
- Causal Inference: The Free eBook - Sep 25, 2020.
Here's another free eBook for those looking to up their skills. If you are seeking a resource that exhaustively treats the topic of causal inference, this book has you covered.
- KDD 2020 Celebrates Recipients of the SIGKDD Best Paper Awards - Sep 24, 2020.
Top Data Scientists Honored for Advanced Research and Applied Data Science in the Field of Knowledge Discovery in Data and Data Mining.
- Introduction to Time Series Analysis in Python - Sep 24, 2020.
Data that is updated in real-time requires additional handling and special care to prepare it for machine learning models. The important Python library, Pandas, can be used for most of this work, and this tutorial guides you through this process for analyzing time-series data.
- The Most Complete Guide to PyTorch for Data Scientists - Sep 24, 2020.
All the PyTorch functionality you will ever need while doing Deep Learning. From an Experimentation/Research Perspective.
- Top KDnuggets tweets, Sep 16-22: An overview of 63 #MachineLearning algorithms - Sep 23, 2020.
Also: Online Certificates/Courses in #AI, #BusinessAnalytics, #DataScience, #MachineLearning from Top Universities; 24 Best (and #Free) #Books To Understand #MachineLearning; New Poll: What Python IDE / Editor you used the most in 2020?; Mathematics for #MachineLearning: The #Free eBook
- How well do you wear your “operationalizing analytics” hat? Take this simple quiz to find out. - Sep 23, 2020.
The first in a series of blogs by FICO’s Benjamin Baer introduces the role of decision management as a critical means to help data-driven insights drive your business decisions.
- LinkedIn’s Pro-ML Architecture Summarizes Best Practices for Building Machine Learning at Scale - Sep 23, 2020.
The reference architecture is powering mission critical machine learning workflows within LinkedIn.
- How I Consistently Improve My Machine Learning Models From 80% to Over 90% Accuracy - Sep 23, 2020.
Data science work typically requires a big lift near the end to increase the accuracy of any model developed. These five recommendations will help improve your machine learning models and help your projects reach their target goals.
- Artificial Intelligence for Precision Medicine and Better Healthcare - Sep 23, 2020.
In this article, we will focus on various machine learning, deep learning models, and applications of AI which can pave the way for a new data-centric era of discovery in healthcare.
- MathWorks Deep learning workflow: tips, tricks, and often forgotten steps - Sep 22, 2020.
Getting started in deep learning – and adopting an organized, sustainable, and reproducible workflow – can be challenging. This blog post will share some tips and tricks to help you develop a systematic, effective, attainable, and scalable deep learning workflow as you experiment with different deep learning models, datasets, and applications.
- New Poll: What Python IDE / Editor you used the most in 2020? - Sep 22, 2020.
The latest KDnuggets poll asks which Python IDE / Editor you have used the most in 2020. Participate now, and share your experiences with the community.
- Machine Learning from Scratch: Free Online Textbook - Sep 22, 2020.
If you are looking for a machine learning starter that gets right to the core of the concepts and the implementation, then this new free textbook will help you dive in to ML engineering with ease. By focusing on the basics of the underlying algorithms, you will be quickly up and running with code you construct yourself.
- The Potential of Predictive Analytics in Labor Industries - Sep 22, 2020.
Predictive analytics isn't just for white-collar work. Check out these five examples that show its potential in blue-collar jobs and industries as well.
- Statistical and Visual Exploratory Data Analysis with One Line of Code - Sep 21, 2020.
If EDA is not executed correctly, it can cause us to start modeling with “unclean” data. See how to use Pandas Profiling to perform EDA with a single line of code.
- I’m a Data Scientist, Not Just The Tiny Hands that Crunch your Data - Sep 21, 2020.
Not everyone "gets" the role of the Data Scientist -- including management. While there can be frustrating aspects of being a data scientist, there are effective ways to go about fixing them.
- Top Stories, Sep 14-20: Automating Every Aspect of Your Python Project; Deep Learning’s Most Important Ideas - Sep 21, 2020.
Also: Statistics with Julia: The Free eBook; Online Certificates/Courses in AI, Data Science, Machine Learning from Top Universities; Autograd: The Best Machine Learning Library You're Not Using?; Implementing a Deep Learning Library from Scratch in Python
- What an Argentine Writer and a Hungarian Mathematician Can Teach Us About Machine Learning Overfitting - Sep 21, 2020.
This article presents some beautiful ideas about intelligence and how they relate to modern machine learning.
- Automating Every Aspect of Your Python Project - Sep 18, 2020.
Every Python project can benefit from automation using Makefile, optimized Docker images, well configured CI/CD, Code Quality Tools and more…
- What is Simpson’s Paradox and How to Automatically Detect it - Sep 18, 2020.
Looking at data one way can tell one story, but sometimes looking at it another way will tell the opposite story. Understanding this paradox and why it happens is essential, and new tools are available to help automatically detect this tricky issue in your datasets.
- The Insiders’ Guide to Generative and Discriminative Machine Learning Models - Sep 18, 2020.
In this article, we will look at the difference between generative and discriminative models and how they contrast with one another.
- Coursera’s Machine Learning for Everyone Fulfills Unmet Training Needs - Sep 17, 2020.
Coursera's Machine Learning for Everyone (free access) fulfills two different kinds of unmet learner needs, for both the technology side and the business side, covering state-of-the-art techniques, business leadership best practices, and a wide range of common pitfalls and how to avoid them.
- How to Effectively Obtain Consumer Insights in a Data Overload Era - Sep 17, 2020.
Everybody knows how important it is to understand your customer, but how do you do that in an era of information overload?
- Unpopular Opinion – Data Scientists Should Be More End-to-End - Sep 17, 2020.
Can a do-it-all Data Scientist really be more effective at delivering new value from data? While it might sound exhausting, important efficiencies can exist that might bring better value to the business even faster.
- Implementing a Deep Learning Library from Scratch in Python - Sep 17, 2020.
A beginner’s guide to understanding the fundamental building blocks of deep learning platforms.
- Top KDnuggets tweets, Sep 9-15: Will You Enroll At #Google University For $49/Month? Here Are International Alternatives to @Kaggle - Sep 16, 2020.
Will You Enroll At #Google University For $49/Month? On @Kaggle some prizes are only for Americans - here are international alternatives; Advanced #NumPy for #DataScience; Free From MIT: Intro to Computer Science and Programming in Python
- Can Neural Networks Show Imagination? DeepMind Thinks They Can - Sep 16, 2020.
DeepMind has done some of the relevant work in the area of simulating imagination in deep learning systems.
- Online Certificates/Courses in AI, Data Science, Machine Learning from Top Universities - Sep 16, 2020.
We present the online courses and certificates in AI, Data Science, Machine Learning, and related topics from the top 20 universities in the world.
- Autograd: The Best Machine Learning Library You’re Not Using? - Sep 16, 2020.
If there is a Python library that is emblematic of the simplicity, flexibility, and utility of differentiable programming it has to be Autograd.
- The Maslow’s hierarchy your data science team needs - Sep 15, 2020.
Domino Data Lab was announced as a leader for the second year in a row in the recently released “Forrester Wave™: Notebook-based Predictive Analytics and Machine Learning (PAML), Q3 2020” analyst report. True to our data science roots, we’ve built a Maslow’s hierarchy of data science team needs.
- DIY Election Fraud Analysis Using Benford’s Law - Sep 15, 2020.
In this article, we discuss a do-it-yourself approach to election analysis and to judging whether an election was conducted fairly.
- Here’s what you need to look for in a model server to build ML-powered services - Sep 15, 2020.
More applications are being infused with machine learning while MLOps processes and best practices are becoming well established. Critical to these software and systems are the servers that run the models, which should feature key capabilities to drive successful enterprise-scale productionizing of machine learning.
- Visualization Of COVID-19 New Cases Over Time In Python - Sep 15, 2020.
Inspired by another concise data visualization, the author of this article has crafted and shared the code for a heatmap which visualizes the COVID-19 pandemic in the United States over time.
- Big Data and AI Toronto Goes Virtual - Sep 14, 2020.
The Big Data and AI Toronto Conference and Expo returns on September 29-30, 2020 with a brand new format and will be held exclusively online. KDnuggets readers get a 25% discount on all-access passes with promo code BDTORONTO-25. Register now.
- Lessons From My First Kaggle Competition - Sep 14, 2020.
How I chose my first Kaggle competition to enter and what I learned from doing it.
- Top Stories, Sep 7-13: Free From MIT: Intro to Computer Science and Programming in Python - Sep 14, 2020.
Also: Modern Data Science Skills: 8 Categories, Core Skills, and Hot Skills; AI Papers to Read in 2020; A Deep Learning Dream: Accuracy and Interpretability in a Single Model; Creating Powerful Animated Visualizations in Tableau; 8 AI/Machine Learning Projects To Make Your Portfolio Stand Out
- Deep Learning’s Most Important Ideas - Sep 14, 2020.
In the field of deep learning, there continues to be a deluge of research and new papers published daily. Many well-adopted ideas that have stood the test of time provide the foundation for much of this new work. To better understand modern deep learning, these techniques cover the basic necessary knowledge, especially as a starting point if you are new to the field.
- Statistics with Julia: The Free eBook - Sep 14, 2020.
This free eBook is a draft copy of the upcoming Statistics with Julia: Fundamentals for Data Science, Machine Learning and Artificial Intelligence. Interested in learning Julia for data science? This might be the best intro out there.
- Top August Stories: Know What Employers are Expecting for a Data Scientist Role in 2020; If I had to start learning Data Science again, how would I do it? - Sep 11, 2020.
Also: Netflix's Polynote is a New Open Source Framework to Build Better Data Science Notebooks; Must-read NLP and Deep Learning articles for Data Scientists.
- Understanding Bias-Variance Trade-Off in 3 Minutes - Sep 11, 2020.
This article is the write-up of a Machine Learning Lightning Talk, intuitively explaining an important data science concept in 3 minutes.
- Feature Engineering for Numerical Data - Sep 11, 2020.
Data feeds machine learning models, and the more the better, right? Well, sometimes numerical data isn't quite right for ingestion, so a variety of methods, detailed in this article, are available to transform raw numbers into something a bit more palatable.
- An Introduction to NLP and 5 Tips for Raising Your Game - Sep 11, 2020.
This article is a collection of things the author would like to have known when they started out in NLP. Perhaps it will be useful for you.
- Math for Programmers - Sep 10, 2020.
Math for Programmers teaches you the math you need to know for a career in programming, concentrating on what you need to know as a developer. Save 50% with code kdmath50.
- AI Papers to Read in 2020 - Sep 10, 2020.
Reading suggestions to keep you up-to-date with the latest and classic breakthroughs in AI and Data Science.
- 6 Common Mistakes in Data Science and How To Avoid Them - Sep 10, 2020.
As a novice or seasoned Data Scientist, your work depends on the data, which is rarely perfect. Properly handling the typical issues with data quality and completeness is crucial, and we review how to avoid six of these common scenarios.
- Let’s Be Honest: We’re Drowning in Data - Sep 10, 2020.
The fields of Big Data, Data Analytics/Science, and Data Integration need to face a new truth: We are drowning in data, more and more so every second of every day.
- Top KDnuggets tweets, Sep 02-08: Training alone is never enough to generate effective supervised machine learning models - Sep 9, 2020.
Also: Tensorbook, a #DeepLearning laptop; 5 Concepts Every #dataScientist Should Know: Multicollinearity, encoding, sampling, error, and storytelling; Top 20 #Python AI and #MachineLearning #OpenSource Projects; How To Decide What Data Skills To Learn
- Seven Reasons to Take This Course Before You Go Hands-On with Machine Learning - Sep 9, 2020.
Eric Siegel's new course series on Coursera, Machine Learning for Everyone, is for any learner who wishes to participate in the business deployment of machine learning. This end-to-end, three-course series is accessible to business-level learners and yet vital to techies as well. It covers both the state-of-the-art techniques and the business-side best practices.
- Free From MIT: Intro to Computer Science and Programming in Python - Sep 9, 2020.
This free introductory computer science and programming course is available via MIT's Open Courseware platform. It's a great resource for mastering the fundamentals of one of data science's major requirements.
- 8 AI/Machine Learning Projects To Make Your Portfolio Stand Out - Sep 9, 2020.
If you are just starting down a path toward a career in Data Science, or you are already a seasoned practitioner, then keeping active to advance your experience through side projects is invaluable to take you to the next professional level. These eight interesting project ideas with source code and reference articles will jump start you to thinking outside of the box.
- 4 Tools to Speed Up Your Data Science Writing - Sep 9, 2020.
This article covers how you can achieve your writing goals with these 4 tools.
- NIST $240K Challenge: Saving Lives, One Pixel at a Time - Sep 8, 2020.
Video analytics that could save lives and property are just out of reach. A new prize challenge, Enhancing Computer Vision for Public Safety, is designed to help develop a new line of research that will bring such tools closer to reality.
- What Does It Take to be a Successful Data Scientist? - Sep 8, 2020.
What is the right approach to earning your stripes and calling yourself a successful data scientist?
- Modern Data Science Skills: 8 Categories, Core Skills, and Hot Skills - Sep 8, 2020.
We analyze the results of the Data Science Skills poll, including 8 categories of skills, 13 core skills that over 50% of respondents have, the emerging/hot skills that data scientists want to learn, and what is the top skill that Data Scientists want to learn.
- 4 Tricks to Effectively Use JSON in Python - Sep 8, 2020.
Working with JSON in Python is a breeze; this will get you started right away.
- Scaling Your Data Compliance Strategy with Legal Automation - Sep 7, 2020.
Join Immuta and the COVID-19 Alliance, a non-profit organization of MIT, for this virtual workshop on Sep 23 @ 1 PM ET, to learn how you can use legal automation to easily scale your data analytics compliance strategy. Register now.
- Top Stories, Aug 31 – Sep 6: Top Online Masters in Analytics, Business Analytics, Data Science – Updated - Sep 7, 2020.
Also: How to Evaluate the Performance of Your Machine Learning Model; Which methods should be used for solving linear regression?; A Curious Theory About the Consciousness Debate in AI; If I had to start learning Data Science again, how would I do it?; PyCaret 2.1 is here: What’s new?
- Creating Powerful Animated Visualizations in Tableau - Sep 7, 2020.
In this post we explore animated data visualization in Tableau, one of the tool's powerful features for making visualizations appealing and interactive.
- 9 Developing Data Science & Analytics Job Trends - Sep 7, 2020.
With so much disruption in 2020 already, a recent report by Burtch Works looks ahead to next year and beyond, and shares insights about how today's hiring market trends may impact our work lives for years to come.
- A Deep Learning Dream: Accuracy and Interpretability in a Single Model - Sep 7, 2020.
IBM Research believes that you can improve the accuracy of interpretable models with knowledge learned in pre-trained models.
- How To Decide What Data Skills To Learn - Sep 4, 2020.
Read this article to learn about getting the most valuable skills on the job market.
- Data Scientists think data is their #1 problem. Here’s why they’re wrong. - Sep 4, 2020.
We tend to think it's all about the data. However, for real data science projects at real organizations in real life, there are more fundamental aspects to consider to do data science right.
- The Most Important Data Science Project - Sep 4, 2020.
What is the project every data scientist must do?
- Book Chapter: The Art of Statistics: Learning from Data - Sep 3, 2020.
Get a free book chapter from "The Art of Statistics: Learning from Data" by leading researcher Sir David John Spiegelhalter. This excerpt takes a forensic look at data surrounding the victims of the UK's most prolific serial killer and shows how a simple search for patterns reveals critical details.
- Design of Experiments in Data Science - Sep 3, 2020.
Read this overview of the process of designing experiments for collecting data.
- How to Evaluate the Performance of Your Machine Learning Model - Sep 3, 2020.
You can train your supervised machine learning models all day long, but unless you evaluate their performance, you can never know if your models are useful. This detailed discussion reviews the various performance metrics you must consider, and offers intuitive explanations for what they mean and how they work.
- 10 Things You Didn’t Know About Scikit-Learn - Sep 3, 2020.
Check out these 10 things you didn’t know about Scikit-Learn... until now.
- Top KDnuggets tweets, Aug 26 – Sep 01: A realistic look at the time spent in a life of a #DataScientist - Sep 2, 2020.
Also: Full Stack #DeepLearning course; Guide to Intelligent #DataScience book - updated/expanded content throughout, now an even better basis for the "educational" part of "becoming a successful data scientist"; Completely Free #MachineLearning Reading List by @vickdata; A Complete #DataScience Portfolio Project
- What Is Data Enrichment And How It Works - Sep 2, 2020.
Learn what data enrichment is, what its different types, benefits, and use cases are, and how Smartproxy helps you do it.
- Computer Vision Recipes: Best Practices and Examples - Sep 2, 2020.
This is an overview of a great computer vision resource from Microsoft, which demonstrates best practices and implementation guidelines for a variety of tasks and scenarios.
- Which methods should be used for solving linear regression? - Sep 2, 2020.
As a foundational set of algorithms in any machine learning toolbox, linear regression can be solved with a variety of approaches. Here, we discuss, with code examples, four methods and demonstrate how they should be used.
- Here is What I’ve Learned in 2 Years as a Data Scientist - Sep 2, 2020.
In this article, for the first time, I’ll condense everything that I’ve learned in 2 years as a data scientist into 5 lessons.
- PyCaret 2.1 is here: What’s new? - Sep 1, 2020.
PyCaret is an open-source, low-code machine learning library in Python that automates the machine learning workflow. It is an end-to-end machine learning and model management tool that speeds up the machine learning experiment cycle and makes you 10x more productive. Read about what's new in PyCaret 2.1.
- Top Online Masters in Analytics, Business Analytics, Data Science – Updated - Sep 1, 2020.
We provide an updated list of best online Masters in AI, Analytics, and Data Science, including rankings, tuition, and duration of the education program.
- Showcasing the Benefits of Software Optimizations for AI Workloads on Intel® Xeon® Scalable Platforms - Sep 1, 2020.
The focus of this blog is to bring to light that continued software optimizations can boost performance not only for the latest platforms, but also for the current install base from prior generations. This means customers can continue to extract value from their current platform investments.