News / Blog
- Making Python Programs Blazingly Fast - Sep 25, 2020.
Let’s look at the performance of our Python programs and see how to make them up to 30% faster!
- Create and Deploy your First Flask App using Python and Heroku - Sep 25, 2020.
Flask is a straightforward and lightweight web application framework for Python applications. This guide walks you through how to write an application using Flask with a deployment on Heroku.
- Causal Inference: The Free eBook - Sep 25, 2020.
Here's another free eBook for those looking to up their skills. If you are seeking a resource that exhaustively treats the topic of causal inference, this book has you covered.
- KDD 2020 Celebrates Recipients of the SIGKDD Best Paper Awards - Sep 24, 2020.
Top Data Scientists Honored for Advanced Research and Applied Data Science in the Field of Knowledge Discovery in Data and Data Mining.
- Introduction to Time Series Analysis in Python - Sep 24, 2020.
Data that is updated in real-time requires additional handling and special care to prepare it for machine learning models. The important Python library, Pandas, can be used for most of this work, and this tutorial guides you through this process for analyzing time-series data.
- The Most Complete Guide to PyTorch for Data Scientists - Sep 24, 2020.
All the PyTorch functionality you will ever need while doing Deep Learning. From an Experimentation/Research Perspective.
- Top KDnuggets tweets, Sep 16-22: An overview of 63 #MachineLearning algorithms - Sep 23, 2020.
Also: Online Certificates/Courses in #AI, #BusinessAnalytics, #DataScience, #MachineLearning from Top Universities; 24 Best (and #Free) #Books To Understand #MachineLearning; New Poll: What Python IDE / Editor you used the most in 2020?; Mathematics for #MachineLearning: The #Free eBook
- How well do you wear your “operationalizing analytics” hat? Take this simple quiz to find out. - Sep 23, 2020.
The first in a series of blogs by FICO’s Benjamin Baer introduces the role of decision management as a critical means to help data-driven insights drive your business decisions.
- LinkedIn’s Pro-ML Architecture Summarizes Best Practices for Building Machine Learning at Scale - Sep 23, 2020.
The reference architecture is powering mission critical machine learning workflows within LinkedIn.
- How I Consistently Improve My Machine Learning Models From 80% to Over 90% Accuracy - Sep 23, 2020.
Data science work typically requires a big lift near the end to increase the accuracy of any model developed. These five recommendations will help improve your machine learning models and help your projects reach their target goals.
- Artificial Intelligence for Precision Medicine and Better Healthcare - Sep 23, 2020.
In this article, we will focus on various machine learning, deep learning models, and applications of AI which can pave the way for a new data-centric era of discovery in healthcare.
- KDnuggets™ News 20:n36, Sep 23: New Poll: What Python IDE / Editor you used the most in 2020?; Automating Every Aspect of Your Python Project - Sep 23, 2020.
New Poll: What Python IDE / Editor you used the most in 2020?; Automating Every Aspect of Your Python Project; Autograd: The Best Machine Learning Library You're Not Using?; Implementing a Deep Learning Library from Scratch in Python; Online Certificates/Courses in AI, Data Science, Machine Learning; Can Neural Networks Show Imagination?
- MathWorks Deep learning workflow: tips, tricks, and often forgotten steps - Sep 22, 2020.
Getting started in deep learning – and adopting an organized, sustainable, and reproducible workflow – can be challenging. This blog post will share some tips and tricks to help you develop a systematic, effective, attainable, and scalable deep learning workflow as you experiment with different deep learning models, datasets, and applications.
- New Poll: What Python IDE / Editor you used the most in 2020? - Sep 22, 2020.
The latest KDnuggets poll asks which Python IDE / Editor you have used the most in 2020. Participate now, and share your experiences with the community.
- Machine Learning from Scratch: Free Online Textbook - Sep 22, 2020.
If you are looking for a machine learning starter that gets right to the core of the concepts and the implementation, then this new free textbook will help you dive in to ML engineering with ease. By focusing on the basics of the underlying algorithms, you will be quickly up and running with code you construct yourself.
- The Potential of Predictive Analytics in Labor Industries - Sep 22, 2020.
Predictive analytics isn't just for white-collar work. Check out these five examples that show its potential in blue-collar jobs and industries as well.
- Statistical and Visual Exploratory Data Analysis with One Line of Code - Sep 21, 2020.
If EDA is not executed correctly, it can cause us to start modeling with “unclean” data. See how to use Pandas Profiling to perform EDA with a single line of code.
- I’m a Data Scientist, Not Just The Tiny Hands that Crunch your Data - Sep 21, 2020.
Not everyone "gets" the role of the Data Scientist -- including management. While there can be frustrating aspects of being a data scientist, there are effective ways to go about fixing them.
- Top Stories, Sep 14-20: Automating Every Aspect of Your Python Project; Deep Learning’s Most Important Ideas - Sep 21, 2020.
Also: Statistics with Julia: The Free eBook; Online Certificates/Courses in AI, Data Science, Machine Learning from Top Universities; Autograd: The Best Machine Learning Library You're Not Using?; Implementing a Deep Learning Library from Scratch in Python
- What an Argentine Writer and a Hungarian Mathematician Can Teach Us About Machine Learning Overfitting - Sep 21, 2020.
This article presents some beautiful ideas about intelligence and how they relate to modern machine learning.
- Automating Every Aspect of Your Python Project - Sep 18, 2020.
Every Python project can benefit from automation using Makefile, optimized Docker images, well configured CI/CD, Code Quality Tools and more…
- What is Simpson’s Paradox and How to Automatically Detect it - Sep 18, 2020.
Looking at data one way can tell one story, but sometimes looking at it another way will tell the opposite story. Understanding this paradox and why it happens is essential, and new tools are available to help automatically detect this tricky issue in your datasets.
- The Insiders’ Guide to Generative and Discriminative Machine Learning Models - Sep 18, 2020.
In this article, we will look at the difference between generative and discriminative models and how they contrast with one another.
- Coursera’s Machine Learning for Everyone Fulfills Unmet Training Needs - Sep 17, 2020.
Coursera's Machine Learning for Everyone (free access) fulfills two different kinds of unmet learner needs, for both the technology side and the business side, covering state-of-the-art techniques, business leadership best practices, and a wide range of common pitfalls and how to avoid them.
- How to Effectively Obtain Consumer Insights in a Data Overload Era - Sep 17, 2020.
Everybody knows how important it is to understand your customer, but how do you do that in an era of information overload?
- Unpopular Opinion – Data Scientists Should Be More End-to-End - Sep 17, 2020.
Can a do-it-all Data Scientist really be more effective at delivering new value from data? While it might sound exhausting, important efficiencies can exist that might bring better value to the business even faster.
- Implementing a Deep Learning Library from Scratch in Python - Sep 17, 2020.
A beginner’s guide to understanding the fundamental building blocks of deep learning platforms.
- Top KDnuggets tweets, Sep 9-15: Will You Enroll At #Google University For $49/Month? Here Are International Alternatives to @Kaggle - Sep 16, 2020.
Will You Enroll At #Google University For $49/Month? On @Kaggle some prizes are only for Americans - here are international alternatives; Advanced #NumPy for #DataScience; Free From MIT: Intro to Computer Science and Programming in Python
- Can Neural Networks Show Imagination? DeepMind Thinks They Can - Sep 16, 2020.
DeepMind has done some of the relevant work in the area of simulating imagination in deep learning systems.
- Online Certificates/Courses in AI, Data Science, Machine Learning from Top Universities - Sep 16, 2020.
We present the online courses and certificates in AI, Data Science, Machine Learning, and related topics from the top 20 universities in the world.
- Autograd: The Best Machine Learning Library You’re Not Using? - Sep 16, 2020.
If there is a Python library that is emblematic of the simplicity, flexibility, and utility of differentiable programming it has to be Autograd.
- KDnuggets™ News 20:n35, Sep 16: Data Science Skills: Core, Emerging, and Most Wanted; Free From MIT: Intro to CS, Programming in Python - Sep 16, 2020.
Check the analysis of the latest KDnuggets Poll: which data science skills are core, which are emerging, and which skill readers most want to learn; Free From MIT: Intro to CS and Programming in Python; 8 AI/Machine Learning Projects To Make Your Portfolio Stand Out; Statistics with Julia: The Free eBook; and more.
- The Maslow’s hierarchy your data science team needs - Sep 15, 2020.
Domino Data Lab was announced as a leader for the second year in a row in the recently released “Forrester Wave™: Notebook-based Predictive Analytics and Machine Learning (PAML), Q3 2020” analyst report. True to our data science roots, we’ve built a Maslow’s hierarchy of data science team needs.
- DIY Election Fraud Analysis Using Benford’s Law - Sep 15, 2020.
In this article, we will talk about a Do-It-Yourself approach towards election analysis and coming to a conclusion whether the elections were conducted fairly or not.
- Here’s what you need to look for in a model server to build ML-powered services - Sep 15, 2020.
More applications are being infused with machine learning while MLOps processes and best practices are becoming well established. Critical to these software and systems are the servers that run the models, which should feature key capabilities to drive successful enterprise-scale productionizing of machine learning.
- Visualization Of COVID-19 New Cases Over Time In Python - Sep 15, 2020.
Inspired by another concise data visualization, the author of this article has crafted and shared the code for a heatmap which visualizes the COVID-19 pandemic in the United States over time.
- Big Data and AI Toronto Goes Virtual - Sep 14, 2020.
The Big Data and AI Toronto Conference and Expo returns on September 29-30, 2020 with a brand new format and will be held exclusively online. KDnuggets readers get a 25% discount on all-access passes with promo code BDTORONTO-25. Register now.
- Lessons From My First Kaggle Competition - Sep 14, 2020.
How I chose my first Kaggle competition to enter and what I learned from doing it.
- Top Stories, Sep 7-13: Free From MIT: Intro to Computer Science and Programming in Python - Sep 14, 2020.
Also: Modern Data Science Skills: 8 Categories, Core Skills, and Hot Skills; AI Papers to Read in 2020; A Deep Learning Dream: Accuracy and Interpretability in a Single Model; Creating Powerful Animated Visualizations in Tableau; 8 AI/Machine Learning Projects To Make Your Portfolio Stand Out
- Deep Learning’s Most Important Ideas - Sep 14, 2020.
In the field of deep learning, there continues to be a deluge of research and new papers published daily. Many well-adopted ideas that have stood the test of time provide the foundation for much of this new work. To better understand modern deep learning, these techniques cover the basic necessary knowledge, especially as a starting point if you are new to the field.
- Statistics with Julia: The Free eBook - Sep 14, 2020.
This free eBook is a draft copy of the upcoming Statistics with Julia: Fundamentals for Data Science, Machine Learning and Artificial Intelligence. Interested in learning Julia for data science? This might be the best intro out there.
- Top August Stories: Know What Employers are Expecting for a Data Scientist Role in 2020; If I had to start learning Data Science again, how would I do it? - Sep 11, 2020.
Also: Netflix's Polynote is a New Open Source Framework to Build Better Data Science Notebooks; Must-read NLP and Deep Learning articles for Data Scientists.
- Understanding Bias-Variance Trade-Off in 3 Minutes - Sep 11, 2020.
This article is the write-up of a Machine Learning Lightning Talk, intuitively explaining an important data science concept in 3 minutes.
- Feature Engineering for Numerical Data - Sep 11, 2020.
Data feeds machine learning models, and the more the better, right? Well, sometimes numerical data isn't quite right for ingestion, so a variety of methods, detailed in this article, are available to transform raw numbers into something a bit more palatable.
- An Introduction to NLP and 5 Tips for Raising Your Game - Sep 11, 2020.
This article is a collection of things the author would like to have known when they started out in NLP. Perhaps it will be useful for you.
- Math for Programmers - Sep 10, 2020.
Math for Programmers teaches you the math you need to know for a career in programming, concentrating on what you need to know as a developer. Save 50% with code kdmath50.
- AI Papers to Read in 2020 - Sep 10, 2020.
Reading suggestions to keep you up-to-date with the latest and classic breakthroughs in AI and Data Science.
- 6 Common Mistakes in Data Science and How To Avoid Them - Sep 10, 2020.
As a novice or seasoned Data Scientist, your work depends on the data, which is rarely perfect. Properly handling the typical issues with data quality and completeness is crucial, and we review how to avoid six of these common scenarios.
- Let’s Be Honest: We’re Drowning in Data - Sep 10, 2020.
The fields of Big Data, Data Analytics/Science, and Data Integration need to face a new truth: We are drowning in data, more and more so every second of every day.
- Top KDnuggets tweets, Sep 02-08: Training alone is never enough to generate effective supervised machine learning models - Sep 9, 2020.
Also: Tensorbook, a #DeepLearning laptop; 5 Concepts Every #dataScientist Should Know: Multicollinearity, encoding, sampling, error, and storytelling; Top 20 #Python AI and #MachineLearning #OpenSource Projects; How To Decide What Data Skills To Learn
- Seven Reasons to Take This Course Before You Go Hands-On with Machine Learning - Sep 9, 2020.
Eric Siegel's new course series on Coursera, Machine Learning for Everyone, is for any learner who wishes to participate in the business deployment of machine learning. This end-to-end, three-course series is accessible to business-level learners and yet vital to techies as well. It covers both the state-of-the-art techniques and the business-side best practices.
- Free From MIT: Intro to Computer Science and Programming in Python - Sep 9, 2020.
This free introductory computer science and programming course is available via MIT's Open Courseware platform. It's a great resource for mastering the fundamentals of one of data science's major requirements.
- 8 AI/Machine Learning Projects To Make Your Portfolio Stand Out - Sep 9, 2020.
If you are just starting down a path toward a career in Data Science, or you are already a seasoned practitioner, then keeping active to advance your experience through side projects is invaluable to take you to the next professional level. These eight interesting project ideas with source code and reference articles will jump-start your thinking outside of the box.
- 4 Tools to Speed Up Your Data Science Writing - Sep 9, 2020.
This article covers how you can achieve your writing goals with these 4 tools.
- KDnuggets™ News 20:n34, Sep 9: Top Online Data Science Masters Degrees; Modern Data Science Skills: 8 Categories, Core Skills, and Hot Skills - Sep 9, 2020.
Also: Creating Powerful Animated Visualizations in Tableau; PyCaret 2.1 is here: What's new?; How To Decide What Data Skills To Learn; How to Evaluate the Performance of Your Machine Learning Model
- NIST $240K Challenge: Saving Lives, One Pixel at a Time - Sep 8, 2020.
Video analytics that could save lives and property are just out of reach. A new prize challenge, Enhancing Computer Vision for Public Safety, is designed to help develop a new line of research that will bring such tools closer to reality.
- What Does It Take to be a Successful Data Scientist? - Sep 8, 2020.
What is the right approach to earning your stripes and calling yourself a successful data scientist?
- Modern Data Science Skills: 8 Categories, Core Skills, and Hot Skills - Sep 8, 2020.
We analyze the results of the Data Science Skills poll, including 8 categories of skills, 13 core skills that over 50% of respondents have, the emerging/hot skills that data scientists want to learn, and what is the top skill that Data Scientists want to learn.
- 4 Tricks to Effectively Use JSON in Python - Sep 8, 2020.
Working with JSON in Python is a breeze; this will get you started right away.
- Scaling Your Data Compliance Strategy with Legal Automation - Sep 7, 2020.
Join Immuta and the COVID-19 Alliance, a non-profit organization of MIT, for this virtual workshop on Sep 23 @ 1 PM ET, to learn how you can use legal automation to easily scale your data analytics compliance strategy. Register now.
- Top Stories, Aug 31 – Sep 6: Top Online Masters in Analytics, Business Analytics, Data Science – Updated - Sep 7, 2020.
Also: How to Evaluate the Performance of Your Machine Learning Model; Which methods should be used for solving linear regression?; A Curious Theory About the Consciousness Debate in AI; If I had to start learning Data Science again, how would I do it?; PyCaret 2.1 is here: What's new?
- Creating Powerful Animated Visualizations in Tableau - Sep 7, 2020.
In this post we explore animated data visualization in Tableau, one of the tool's powerful features for making visualizations appealing and interactive.
- 9 Developing Data Science & Analytics Job Trends - Sep 7, 2020.
With so much disruption in 2020 already, a recent report by Burtch Works looks ahead to next year and beyond, and shares insights about how today's hiring market trends may impact our work lives for years to come.
- A Deep Learning Dream: Accuracy and Interpretability in a Single Model - Sep 7, 2020.
IBM Research believes that you can improve the accuracy of interpretable models with knowledge learned in pre-trained models.
- How To Decide What Data Skills To Learn - Sep 4, 2020.
Read this article to learn about getting the most valuable skills on the job market.
- Data Scientists think data is their #1 problem. Here’s why they’re wrong. - Sep 4, 2020.
We tend to think it's all about the data. However, for real data science projects at real organizations in real life, there are more fundamental aspects to consider to do data science right.
- The Most Important Data Science Project - Sep 4, 2020.
What is the project every data scientist must do?
- Book Chapter: The Art of Statistics: Learning from Data - Sep 3, 2020.
Get a free book chapter from "The Art of Statistics: Learning from Data" by leading researcher Sir David John Spiegelhalter. This excerpt takes a forensic look at data surrounding the victims of the UK's most prolific serial killer and shows how a simple search for patterns reveals critical details.
- Design of Experiments in Data Science - Sep 3, 2020.
Read this overview of the process of designing experiments for collecting data.
- How to Evaluate the Performance of Your Machine Learning Model - Sep 3, 2020.
You can train your supervised machine learning models all day long, but unless you evaluate their performance, you can never know if your models are useful. This detailed discussion reviews the various performance metrics you must consider, and offers intuitive explanations for what they mean and how they work.
- 10 Things You Didn’t Know About Scikit-Learn - Sep 3, 2020.
Check out these 10 things you didn’t know about Scikit-Learn... until now.
- Top KDnuggets tweets, Aug 26 – Sep 01: A realistic look at the time spent in a life of a #DataScientist - Sep 2, 2020.
Also: Full Stack #DeepLearning course; Guide to Intelligent #DataScience book - updated/expanded content throughout, now an even better basis for the "educational" part of "becoming a successful data scientist"; Completely Free #MachineLearning Reading List by @vickdata; A Complete #DataScience Portfolio Project
- What Is Data Enrichment And How It Works - Sep 2, 2020.
Learn what data enrichment is, its different types, benefits, and use cases, and how Smartproxy helps you do it.
- Computer Vision Recipes: Best Practices and Examples - Sep 2, 2020.
This is an overview of a great computer vision resource from Microsoft, which demonstrates best practices and implementation guidelines for a variety of tasks and scenarios.
- Which methods should be used for solving linear regression? - Sep 2, 2020.
As a foundational set of algorithms in any machine learning toolbox, linear regression can be solved with a variety of approaches. Here, we discuss four methods, with code examples, and demonstrate how they should be used.
- Here is What I’ve Learned in 2 Years as a Data Scientist - Sep 2, 2020.
In this article, for the first time, I’ll consolidate everything that I’ve learned and condense all of these into 5 lessons that I’ve learned in 2 years as a data scientist.
- PyCaret 2.1 is here: What’s new? - Sep 1, 2020.
PyCaret is an open-source, low-code machine learning library in Python that automates the machine learning workflow. It is an end-to-end machine learning and model management tool that speeds up the machine learning experiment cycle and makes you 10x more productive. Read about what's new in PyCaret 2.1.
- Top Online Masters in Analytics, Business Analytics, Data Science – Updated - Sep 1, 2020.
We provide an updated list of best online Masters in AI, Analytics, and Data Science, including rankings, tuition, and duration of the education program.
- Showcasing the Benefits of Software Optimizations for AI Workloads on Intel® Xeon® Scalable Platforms - Sep 1, 2020.
The focus of this blog is to bring to light that continued software optimizations can boost performance not only for the latest platforms, but also for the current install base from prior generations. This means customers can continue to extract value from their current platform investments.
- eBook: Vocabularies, Text Mining and FAIR Data: The Strategic Role Information Managers Play - Aug 31, 2020.
How can information managers find strategic roles to play in their organization's AI and data analysis projects? Download this book to learn more.
- A Curious Theory About the Consciousness Debate in AI - Aug 31, 2020.
Dr. Michio Kaku has formulated a very interesting theory of consciousness that applies to AI systems.
- Top Stories, Aug 24-30: If I had to start learning Data Science again, how would I do it?; 4 ways to improve your TensorFlow model – key regularization techniques you need to know - Aug 31, 2020.
Also: The NLP Model Forge: Generate Model Code On Demand; DeepMind's Three Pillars for Building Robust Machine Learning Systems; Beyond the Turing Test; Must-read NLP and Deep Learning articles for Data Scientists; How to Optimize Your CV for a Data Scientist Career
- Linguistic Fundamentals for Natural Language Processing: 100 Essentials from Semantics and Pragmatics - Aug 31, 2020.
Algorithms for text analytics must model how language works to incorporate meaning in language—and so do the people deploying these algorithms. Bender & Lascarides 2019 is an accessible overview of what the field of linguistics can teach NLP about how meaning is encoded in human languages.
- Accelerated Computer Vision: A Free Course From Amazon - Aug 31, 2020.
Amazon's Machine Learning University is making its online courses available to the public, and this time we look at its Accelerated Computer Vision offering.
- Data is everywhere and it powers everything we do! - Aug 28, 2020.
In this article I would like to focus on how companies can start their data-centric strategies and how to achieve success in their data transformation journeys. I have tried to share my thoughts on why companies have to treat data as central to their growth, to being competitive, smarter, and innovative, and to being prepared for any unforeseen market surprises.
- Beyond the Turing Test - Aug 28, 2020.
With more advancements in AI, it might be time to replace the age-old Turing Test with something better to determine if a machine is thinking. Specifically, a more modern approach might include standard questions designed to probe various facets of intelligence, and comparing the computer to a spectrum of human respondents of different ages, sexes, backgrounds, and abilities.
- Microsoft’s DoWhy is a Cool Framework for Causal Inference - Aug 28, 2020.
Inspired by Judea Pearl’s do-calculus for causal inference, the open source framework provides a programmatic interface for popular causal inference methods.
- Explainable and Reproducible Machine Learning Model Development with DALEX and Neptune - Aug 27, 2020.
With ML models serving real people, misclassified cases (which are a natural consequence of using ML) are affecting people’s lives, sometimes very unfairly. This makes the ability to explain your models’ predictions a requirement rather than just a nice-to-have.
- 4 ways to improve your TensorFlow model – key regularization techniques you need to know - Aug 27, 2020.
Regularization techniques are crucial for preventing your models from overfitting and enable them to perform better on your validation and test sets. This guide provides a thorough overview with code of four key approaches you can use for regularization in TensorFlow.
- Working with Spark, Python or SQL on Azure Databricks - Aug 27, 2020.
Here we look at some ways to interchangeably work with Python, PySpark and SQL using Azure Databricks, an Apache Spark-based big data analytics service designed for data science and data engineering offered by Microsoft.