- Free Python Crash Course - Jul 4, 2022.
Python is the most popular programming language in the world. Master it with this free crash course.
Python
- 7 Steps to Mastering Python for Data Science - Jun 30, 2022.
Here’s how you can learn to code in Python from scratch in 7 easy steps.
Python
- How to Stay Current in Python - Jun 27, 2022.
Staying current with Python will help maintain your hireability, get new opportunities, and continue to grow your knowledge.
Python
- Python For Machine Learning: eBook Review - Jun 14, 2022.
The guide to writing production-ready Python code for machine learning projects.
Python
- Pydon’ts – Write elegant Python code: Free Book Review - May 23, 2022.
The book consists of 200 actionable Python insights with a detailed explanation of how to write elegant, compelling, and expressive code.
Python
- The 6 Python Machine Learning Tools Every Data Scientist Should Know About - May 20, 2022.
Let's look at six must-have tools every data scientist should use.
Python
- Why You Need To Learn Python In 2022 - Apr 28, 2022.
If you don’t already know a programming language, or if you’re deciding to choose another language, have a read and see if Python is for you.
Python
- How to Determine the Best Fitting Data Distribution Using Python - Apr 19, 2022.
Approaches to data sampling, modeling, and analysis can vary based on the distribution of your data, and so determining the best fit theoretical distribution can be an essential step in your data exploration process.
Python
- 5 Different Ways to Load Data in Python - Apr 15, 2022.
Data is the bread and butter of a Data Scientist, so knowing many approaches to loading data for analysis is crucial. Here, five Python techniques to bring in your data are reviewed with code examples for you to follow.
Python
- Data Visualization in Python with Seaborn - Apr 13, 2022.
Learn to create beautiful charts in Python using the Seaborn library.
Python
- Python Libraries Data Scientists Should Know in 2022 - Apr 11, 2022.
Let’s have a look at the Python libraries that every data scientist should know in 2022, to maintain and improve their coding journey.
Python
- Data Ingestion with Pandas: A Beginner Tutorial - Apr 6, 2022.
Learn tricks on importing various data formats using Pandas with a few lines of code. We will be learning to import SQL databases, Excel sheets, HTML tables, CSV, and JSON files with examples.
Python
- Introductory Pandas Tutorial - Mar 31, 2022.
A gentle introduction to data analysis with Pandas.
Python
- Building a Geospatial Application in Python with Google Earth Engine and Greppo - Mar 18, 2022.
In this blog, you will see how to build a web application with Greppo and Google Earth Engine using Python.
Python
- How to Manage Multiple Inheritance in Python - Mar 16, 2022.
In this guide, we'll learn how to use multiple inheritance in Python and make it sustainable.
Python
- How to Engineer Date Features in Python - Mar 15, 2022.
This article discusses and demonstrates how to quickly engineer some common date features using Python.
Python
- How to Filter Data with Python - Feb 22, 2022.
Let’s dive a little deeper into some simple operations that might make your everyday work a little easier.
Python
- Managing Your Reusable Python Code as a Data Scientist - Feb 11, 2022.
Here are a few approaches that I have settled on for managing my own reusable Python code as a data scientist, presented from most to least general code use, and aimed at beginners.
Python
- Build a Web Scraper with Python in 5 Minutes - Feb 7, 2022.
In this article, I will show you how to create a web scraper from scratch in Python.
Python
- The Best Python Courses: An Analysis Summary - Jan 24, 2022.
What does the data reveal if we ask: "What are the 10 Best Python Courses?" Collecting almost all of the courses from top platforms shows there are plenty to choose from, with over 3000 offerings. This article summarizes my analysis and presents the top three courses.
Python
- How to Process a DataFrame with Millions of Rows in Seconds - Jan 18, 2022.
TL;DR: process it with a new Python Data Processing Engine in the Cloud.
Python
- Automate Microsoft Excel and Word Using Python - Jan 6, 2022.
Integrate Excel with Word to generate automated reports seamlessly.
Python
- Why are More Developers Using Python for Their Machine Learning Projects? - Jan 4, 2022.
To support the creation of new and exciting ML and artificial intelligence (AI) applications, developers need a robust programming language. That's where the Python programming language comes in.
Python
- What Makes Python An Ideal Programming Language For Startups - Dec 31, 2021.
In this blog, we will discuss what makes Python so popular, its features, and why you should consider Python as a programming language for your startup.
Programming Languages, Python, Startups
- 3 Tools to Track and Visualize the Execution of Your Python Code - Dec 30, 2021.
Avoid headaches when debugging in one line of code.
Programming, Python, Tools, Visualization
- Hands-On Reinforcement Learning Course, Part 2 - Dec 28, 2021.
Continue your learning journey in Reinforcement Learning with this second part of a two-part tutorial that covers the foundations of the technique with examples and Python code.
Agents, Beginners, Python, Reinforcement Learning
- The Easiest Way to Make Beautiful Interactive Visualizations With Pandas - Dec 28, 2021.
Check out these one-line interactive visualizations with Pandas in Python.
Data Visualization, Interactive, Pandas, Python
- Alternative Feature Selection Methods in Machine Learning - Dec 24, 2021.
Feature selection methodologies go beyond filter, wrapper and embedded methods. In this article, I describe 3 alternative algorithms to select predictive features based on a feature importance score.
Data Preparation, Feature Selection, Machine Learning, Python
- Hands-On Reinforcement Learning Course, Part 1 - Dec 22, 2021.
Start your learning journey in Reinforcement Learning with this first part of a two-part tutorial that covers the foundations of the technique with examples and Python code.
Agents, Beginners, Python, Reinforcement Learning
- KDnuggets™ News 21:n48, Dec 22: Write Clean Python Code Using Pipes; 5 Key Skills Needed To Become a Great Data Scientist - Dec 22, 2021.
Write Clean Python Code Using Pipes; 5 Key Skills Needed To Become a Great Data Scientist; A Full End-to-End Deployment of a Machine Learning Algorithm into a Live Production Environment; The 5 Characteristics of a Successful Data Scientist; Top Resources for Learning Statistics for Data Science
Career Advice, Data Science, Programming, Python
- Three R Libraries Every Data Scientist Should Know (Even if You Use Python) - Dec 20, 2021.
Check out these powerful R libraries built by the world’s biggest tech companies.
Data Science, Data Scientist, Python, R
- 10 Key AI & Data Analytics Trends for 2022 and Beyond - Dec 17, 2021.
What AI and data analytics trends are taking the industry by storm this year? This comprehensive review highlights upcoming directions in AI to carefully watch and consider implementing in your personal work or organization.
2022 Predictions, AI, Data, Data Analysis, Deep Learning, Environment, Low-Code, No-Code, Python, Trends
- Write Clean Python Code Using Pipes - Dec 15, 2021.
A short and clean approach to processing iterables.
Programming, Python
- Introduction to Clustering in Python with PyCaret - Dec 13, 2021.
A step-by-step, beginner-friendly tutorial for unsupervised clustering tasks in Python using PyCaret.
Clustering, Machine Learning, PyCaret, Python
- Analyzing Scientific Articles with fine-tuned SciBERT NER Model and Neo4j - Dec 9, 2021.
In this article, we will be analyzing a dataset of scientific abstracts using the Neo4j Graph database and a fine-tuned SciBERT model.
BERT, Graph Analytics, Neo4j, NLP, Python, Research
- Introduction to Binary Classification with PyCaret - Dec 7, 2021.
PyCaret is an alternative low-code library that can replace hundreds of lines of code with just a few. See how to use it for binary classification.
Classification, Machine Learning, PyCaret, Python
- A Beginner’s Guide to End to End Machine Learning - Dec 6, 2021.
Learn to train, tune, deploy and monitor machine learning models.
Beginners, Machine Learning, MLflow, PyCaret, Python
- Using PyCaret’s New Time Series Module - Dec 3, 2021.
PyCaret’s new time series module is now available in beta. Staying true to the simplicity of PyCaret, it is consistent with the existing API and comes with a lot of functionalities.
Machine Learning, PyCaret, Python, Time Series
- Avoid These Mistakes with Time Series Forecasting - Dec 2, 2021.
A few checks to make before training a Machine Learning model on data that could be random.
Forecasting, Mistakes, Python, Time Series
- PyCaret 2.3.5 Is Here! Learn What’s New - Nov 26, 2021.
Read about the new functionalities added in PyCaret’s recent release.
Open Source, PyCaret, Python
- A Spreadsheet that Generates Python: The Mito JupyterLab Extension - Nov 25, 2021.
You can call Mito into your Jupyter Environment and each edit you make will generate the equivalent Python in the code cell below.
Jupyter, Programming, Python, Spreadsheet
- 5 Advanced Tips on Python Sequences - Nov 23, 2021.
Notes from Fluent Python by Luciano Ramalho.
Programming, Python
- Dask DataFrame is not Pandas - Nov 22, 2021.
This article is the second article of an ongoing series on using Dask in practice. Each article in this series will be simple enough for beginners, but provide useful tips for real work. The next article in the series is about parallelizing for loops, and other embarrassingly parallel operations with dask.delayed.
Dask, Pandas, Python, Saturn Cloud
- Build a Serverless News Data Pipeline using ML on AWS Cloud - Nov 18, 2021.
This is a guide on how to build a serverless data pipeline on AWS with a Machine Learning model deployed as a SageMaker endpoint.
AWS, NLP, Pipeline, Python, Sagemaker, Text Summarization
- Easy Synthetic Data in Python with Faker - Nov 17, 2021.
Faker is a Python library that generates fake data to supplement or take the place of real world data. See how it can be used for data science.
Data Science, Python, Synthetic Data
- Deep Learning on your phone: PyTorch C++ API for use on Mobile Platforms - Nov 12, 2021.
The PyTorch Deep Learning framework has a C++ API for use on mobile platforms. This article shows an end-to-end demo of how to write a simple C++ application with Deep Learning capabilities using the PyTorch C++ API such that the same code can be built for use on mobile platforms (both Android and iOS).
C++, Deep Learning, Mobile, Python, PyTorch
- 25 Github Repositories Every Python Developer Should Know - Nov 12, 2021.
Check out these repositories to help you improve your data science skills.
GitHub, Programming, Python
- What Comes After HDF5? Seeking a Data Storage Format for Deep Learning - Nov 9, 2021.
HDF5 is one of the most popular and reliable formats for non-tabular, numerical data, but it is not optimized for deep learning work. This article suggests what an ML-native data format should look like to truly serve the needs of modern data scientists.
Data Management, Deep Learning, Python
- KDnuggets™ News 21:n42, Nov 3: Google Recommendations Before Taking Their Machine Learning Course; Guide to Data Science Jobs - Nov 3, 2021.
What Google Recommends You do Before Taking Their Machine Learning or Data Science Course; A Guide to 14 Different Data Science Jobs; Analyze Python Code in Jupyter Notebooks; Machine Learning Model Development and Model Operations: Principles and Practices; Want to Join a Bank? Everything Data Scientists Need to Know About Working in Fintech
Career Advice, Data Science, Finance, Google, Jobs, Jupyter, Machine Learning, Python
- ORDAINED: The Python Project Template - Nov 2, 2021.
Recently I decided to take the time to better understand the Python packaging ecosystem and create a project boilerplate template as an improvement over copying a directory tree and doing find and replace.
Development, Programming, Project, Python
- Advanced PyTorch Lightning with TorchMetrics and Lightning Flash - Nov 1, 2021.
In this tutorial we will be diving deeper into two additional tools you should be using: TorchMetrics and Lightning Flash. TorchMetrics unsurprisingly provides a modular approach to define and track useful metrics across batches and devices, while Lightning Flash offers a suite of functionality facilitating more efficient transfer learning and data handling, and a recipe book of state-of-the-art approaches to typical deep learning problems.
Metrics, Python, PyTorch, PyTorch Lightning, Transfer Learning
- Simple Text Scraping, Parsing, and Processing with this Python Library - Oct 29, 2021.
Scraping, parsing, and processing text data from the web can be difficult. But it can also be easy, using Newspaper3k.
Data Processing, NLP, Python, Text Analytics, Web Scraping
- Analyze Python Code in Jupyter Notebooks - Oct 28, 2021.
We present a new tool that integrates modern code analysis techniques with Jupyter notebooks and helps developers find bugs as they write code.
Jupyter, Programming, Python
- Getting Started with PyTorch Lightning - Oct 26, 2021.
As a library designed for production research, PyTorch Lightning streamlines hardware support and distributed training as well, and we’ll show how easy it is to move training to a GPU toward the end.
Deep Learning, Machine Learning, Python, PyTorch, PyTorch Lightning
- Introduction to AutoEncoder and Variational AutoEncoder (VAE) - Oct 22, 2021.
Autoencoders and their variants are interesting and powerful artificial neural networks used in unsupervised learning scenarios. Learn how autoencoders perform in their different approaches and how to implement them with Keras on the instructional MNIST digits data set.
Autoencoder, Deep Learning, Machine Learning, Python
- Find the Best-Matching Distribution for Your Data Effortlessly - Oct 22, 2021.
How to find the best-matching statistical distributions for your data points — in an automated and easy way. And, then how to extend the utility further.
Distribution, Python, Statistics, Synthetic Data
- Training BPE, WordPiece, and Unigram Tokenizers from Scratch using Hugging Face - Oct 21, 2021.
Comparing the tokens generated by SOTA tokenization algorithms using Hugging Face's tokenizers package.
Hugging Face, NLP, Python, Tokenization
- KDnuggets™ News 21:n40, Oct 20: The 20 Python Packages You Need For Machine Learning and Data Science; Ace Data Science Interviews with Portfolio Projects - Oct 20, 2021.
The 20 Python Packages You Need For Machine Learning and Data Science; How to Ace Data Science Interview by Working on Portfolio Projects; Deploying Your First Machine Learning API; Real Time Image Segmentation Using 5 Lines of Code; What is Clustering and How Does it Work?
Clustering, Computer Vision, Data Science, Image Recognition, Interview, Machine Learning, Portfolio, Python
- Real Time Image Segmentation Using 5 Lines of Code - Oct 18, 2021.
PixelLib is a library created to allow easy integration of object segmentation in images and videos using a few lines of Python code. PixelLib now provides support for a PyTorch backend to perform faster, more accurate segmentation and extraction of objects in images and videos using the PointRend segmentation architecture.
Computer Vision, Image Processing, Machine Learning, Python, Segmentation
- Serving ML Models in Production: Common Patterns - Oct 18, 2021.
Over the past couple years, we've seen 4 common patterns of machine learning in production: pipeline, ensemble, business logic, and online learning. In the ML serving space, implementing these patterns typically involves a tradeoff between ease of development and production readiness. Ray Serve was built to support these patterns by being both easy to develop and production ready.
FastAPI, Machine Learning, Production, Python, Ray
- Deploying Your First Machine Learning API - Oct 14, 2021.
An effortless way to develop and deploy your machine learning API using FastAPI and Deta.
API, Deployment, FastAPI, Machine Learning, Python, spaCy
- The 20 Python Packages You Need For Machine Learning and Data Science - Oct 14, 2021.
Do you do Python? Do you do data science and machine learning? Then, you need to do these crucial Python libraries that enable nearly all you will want to do.
Data Science, Keras, Machine Learning, Matplotlib, numpy, Pandas, Plotly, Python, PyTorch, scikit-learn, TensorFlow
- Building Multimodal Models: Using the widedeep Pytorch package - Oct 13, 2021.
This article gets you started on the open-source widedeep PyTorch framework developed by Javier Rodriguez Zaurin.
Machine Learning, Modeling, Python, PyTorch
- Create Synthetic Time-series with Anomaly Signatures in Python - Oct 12, 2021.
A simple and intuitive way to create synthetic (artificial) time-series data with customized anomalies — particularly suited to industrial applications.
Anomalies, Python, Synthetic Data, Time Series
- AutoML: An Introduction Using Auto-Sklearn and Auto-PyTorch - Oct 11, 2021.
AutoML is a broad category of techniques and tools for applying automated search to your automated search and learning to your learning. In addition to Auto-Sklearn, the Freiburg-Hannover AutoML group has also developed an Auto-PyTorch library. We’ll use both of these as our entry point into AutoML in the following simple tutorial.
Automated Machine Learning, AutoML, Python, PyTorch, scikit-learn
- The Evolution of Tokenization – Byte Pair Encoding in NLP - Oct 7, 2021.
Though we have SOTA algorithms for tokenization, it's always good practice to understand how the field evolved and how we got here. Read this introduction to Byte Pair Encoding.
NLP, Python, Tokenization
- How to do “Limitless” Math in Python - Oct 7, 2021.
How to perform arbitrary-precision computation and much more math (and fast too) than what is possible with the built-in math library in Python.
Linear Algebra, Mathematics, Probability, Python, Statistics
- Here’s Why You Need Python Skills as a Machine Learning Engineer - Oct 6, 2021.
If you want to learn how to apply Python programming skills in the context of AI applications, the UC San Diego Extension Machine Learning Engineering Bootcamp can help. Read on to find out more about how machine learning engineers use Python, and why the language dominates today’s machine learning landscape.
Bootcamp, Machine Learning Engineer, Online Education, Python, UCSD
- Parallelizing Python Code - Oct 4, 2021.
This article reviews some common options for parallelizing Python code, including process-based parallelism, specialized libraries, ipython parallel, and Ray.
Distributed Computing, Parallelism, Programming, Python, Ray
- Teaching AI to Classify Time-series Patterns with Synthetic Data - Oct 1, 2021.
How to build and train an AI model to identify various common anomaly patterns in time-series data.
AI, Classification, Python, Synthetic Data, Time Series
- How to Auto-Detect the Date/Datetime Columns and Set Their Datatype When Reading a CSV File in Pandas - Oct 1, 2021.
When read_csv() reads e.g. “2021-03-04” and “2021-03-04 21:37:01.123” as mere “object” datatypes, often you can simply auto-convert them all at once to true datetime datatypes.
Data Processing, Pandas, Python
- How To Build A Database Using Python - Sep 28, 2021.
Implement your database without handling the SQL using the Flask-SQLAlchemy library.
Databases, Flask, Python, SQL
- Building a Structured Financial Newsfeed Using Python, SpaCy and Streamlit - Sep 28, 2021.
Getting started with NLP by building a Named Entity Recognition (NER) application.
Finance, NLP, Python, spaCy, Streamlit
- Path to Full Stack Data Science - Sep 27, 2021.
Start your journey toward mastering all aspects of the field of Data Science with this focused list of in-depth self-learning resources. Curated with the beginner in mind, these recommendations will help you learn efficiently, and can also offer existing professionals useful highlights for review or help filling in any gaps in skills.
Career Advice, Data Science, Data Science Education, Data Visualization, Mathematics, Python, R, Roadmap
- Zero to RAPIDS in Minutes with NVIDIA GPUs + Saturn Cloud - Sep 27, 2021.
Managing large-scale data science infrastructure presents significant challenges. With Saturn Cloud, managing GPU-based infrastructure is made easier, allowing practitioners and enterprises to focus on solving their business challenges.
GPU, NVIDIA, Python, Saturn Cloud
- How To Deal With Imbalanced Classification, Without Re-balancing the Data - Sep 23, 2021.
Before considering oversampling your skewed data, try adjusting your classification decision threshold, in Python.
Balancing Classes, Classification, Python, Unbalanced
- 9 Outstanding Reasons to Learn Python for Finance - Sep 23, 2021.
Is Python good for learning finance and working in the financial world? The answer is not only a resounding YES, but yes for nine very good reasons. This article gets into the details behind why Python is a must-know programming language for anyone who wants to work in the financial sector.
Finance, Python
- KDnuggets™ News 21:n36, Sep 22: The Machine & Deep Learning Compendium Open Book; Easy SQL in Native Python - Sep 22, 2021.
The Machine & Deep Learning Compendium Open Book; Easy SQL in Native Python; Introduction to Automated Machine Learning; How to be a Data Scientist without a STEM degree; What Is The Real Difference Between Data Engineers and Data Scientists?
Automated Machine Learning, AutoML, Books, Data Engineer, Data Scientist, Machine Learning, Python, SQL
- 15 Must-Know Python String Methods - Sep 21, 2021.
It is not always about numbers.
Data Processing, NLP, Python, Text Analytics
- If You Can Write Functions, You Can Use Dask - Sep 21, 2021.
This article is the second article of an ongoing series on using Dask in practice. Each article in this series will be simple enough for beginners, but provide useful tips for real work. The first article in the series is about using LocalCluster.
Cloud, Dask, Python, Saturn Cloud
- How to be a Data Scientist without a STEM degree - Sep 20, 2021.
Breaking into data science as a professional does require technical skills, a well-honed knack for problem-solving, and a willingness to swim in oceans of data. Maybe you are coming in as a career change or ready to take a new learning path in life--without having previously earned an advanced degree in a STEM field. Follow these tips to find your way into this high-demand and interesting field.
Career Advice, Data Science Education, Data Scientist, Project, Python, SQL
- Adventures in MLOps with Github Actions, Iterative.ai, Label Studio and NBDEV - Sep 16, 2021.
This article documents the authors' experience building their custom MLOps approach.
GitHub, Machine Learning, MLOps, Pipeline, Python, Workflow
- Introduction to Automated Machine Learning - Sep 15, 2021.
AutoML enables developers with limited ML expertise (and coding experience) to train high-quality models specific to their business needs. For this article, we will focus on AutoML systems which cater to everyday business and technology applications.
Automated Machine Learning, AutoML, Machine Learning, Python
- How to get Python PCAP Certification: Roadmap, Resources, Tips For Success, Based On My Experience - Sep 15, 2021.
Follow this journey of personal experience -- with useful tips and learning resources -- to help you achieve the PCAP Certification, one of the most reputed Python Certifications, to validate your knowledge against International Standards.
Advice, Certification, Python, Tips
- 5 Must Try Awesome Python Data Visualization Libraries - Sep 15, 2021.
The goal of data visualization is to communicate data or information clearly and effectively to readers. Here are 5 must try awesome Python libraries for helping you do so, with overviews and links to quick start guides for each.
Data Visualization, Matplotlib, Plotly, Python, Seaborn
- KDnuggets™ News 21:n35, Sep 15: A Data Science Portfolio That Will Land You The Job; Top 18 Low-Code and No-Code Machine Learning Platforms - Sep 15, 2021.
Here is a Data Science Portfolio that will land you the job; Review the top 18 Low-Code and No-Code Machine Learning platforms; Try these 8 Deep Learning Project Ideas for Beginners; Very useful - working with Python APIs for data science project.
API, Deep Learning, Low-Code, No-Code, Portfolio, Project, Python
- An Introduction to Reinforcement Learning with OpenAI Gym, RLlib, and Google Colab - Sep 14, 2021.
Get an Introduction to Reinforcement Learning by attempting to balance a virtual CartPole with OpenAI Gym, RLlib, and Google Colab.
Google Colab, OpenAI, Python, Reinforcement Learning
- The Prefect Way to Automate & Orchestrate Data Pipelines - Sep 13, 2021.
I am migrating all my ETL work from Airflow to this super-cool framework.
Airflow, Data Workflow, Pipeline, Prefect, Python
- Working with Python APIs For Data Science Project - Sep 10, 2021.
In this article, we will work with the YouTube Python API to collect video statistics from our channel, using the requests Python library to make an API call and saving the result as a Pandas DataFrame.
API, Data Science, Project, Python
- How to Create an AutoML Pipeline Optimization Sandbox - Sep 9, 2021.
In this article, we will implement an automated machine learning pipeline optimization sandbox web app using Streamlit and TPOT.
Automated Machine Learning, AutoML, Python, Streamlit
- KDnuggets™ News 21:n34, Sep 8: Do You Read Excel Files with Python? There is a 1000x Faster Way; Hypothesis Testing Explained - Sep 8, 2021.
Do You Read Excel Files with Python? There is a 1000x Faster Way; Hypothesis Testing Explained; Data Science Cheat Sheet 2.0; 6 Cool Python Libraries That I Came Across Recently; Best Resources to Learn Natural Language Processing in 2021
AI, Cheat Sheet, Data Science, Excel, Hypothesis Testing, Machine Learning, Python, Statistics
- How to Create Stunning Web Apps for your Data Science Projects - Sep 7, 2021.
Data scientists do not have to learn HTML, CSS, and JavaScript to build web pages.
Apps, Data Science, Python, Streamlit
- Fast AutoML with FLAML + Ray Tune - Sep 6, 2021.
Microsoft Researchers have developed FLAML (Fast Lightweight AutoML) which can now utilize Ray Tune for distributed hyperparameter tuning to scale up FLAML’s resource-efficient & easily parallelizable algorithms across a cluster.
Automated Machine Learning, AutoML, Hyperparameter, Machine Learning, Microsoft, Python, Ray
- 6 Cool Python Libraries That I Came Across Recently - Sep 3, 2021.
Check out these awesome Python libraries for Machine Learning.
Data Science, Machine Learning, Python
- Do You Read Excel Files with Python? There is a 1000x Faster Way - Sep 1, 2021.
In this article, I’ll show you five ways to load data in Python, achieving a speedup of 3 orders of magnitude.
Excel, Microsoft, Pandas, Python, Scalability
- KDnuggets™ News 21:n33, Sep 1: Top Industries Hiring Data Scientists; The Most Important Tool for Data Engineers - Sep 1, 2021.
The top industries hiring Data Scientists; The most important tool for data engineers (hint - it is not technical); How to Engineer Date Features in Python; 15 Python Snippets to Optimize your Data Science Pipeline
Data Engineer, Data Science, Hiring, Industry, Pipeline, Python
- NLP Insights for the Penguin Café Orchestra - Aug 31, 2021.
We give an example of how to use Expert.ai and Python to investigate favorite music albums.
Expert.ai, Music, NLP, Python
- CSV Files for Storage? No Thanks. There’s a Better Option - Aug 31, 2021.
Saving data to CSVs is costing you both money and disk space. It’s time to end it.
Data Management, Pandas, Parquet, Python
- A Python Data Processing Script Template - Aug 31, 2021.
Here's a skeleton general-purpose template for getting a Python command-line script fleshed out as quickly as possible.
Programming, Python
- Introducing Packed BERT for 2x Training Speed-up in Natural Language Processing - Aug 30, 2021.
Check out this new BERT packing algorithm for more efficient training.
BERT, NLP, Python, Training
- How causal inference lifts augmented analytics beyond flatland - Aug 27, 2021.
In our quest to better understand and predict business outcomes, traditional predictive modeling tends to fall flat. However, causal inference techniques along with business analytics approaches can unravel what truly changes your KPIs.
Analytics, Causality, Data Science, Python, Regression
- 15 Python Snippets to Optimize your Data Science Pipeline - Aug 25, 2021.
Quick Python solutions to help your data science cycle.
Data Science, Optimization, Pipeline, Python
- KDnuggets™ News 21:n32, Aug 25: Open Source Datasets for Computer Vision; Django’s 9 Most Common Applications - Aug 25, 2021.
Open Source Datasets for Computer Vision; Django’s 9 Most Common Applications; How to Select an Initial Model for your Data Science Problem; Automate Microsoft Excel and Word Using Python; Stack Overflow Survey Data Science Highlights
Computer Vision, Datasets, Django, Microsoft, Modeling, Open Source, Python, StackOverflow
- Learning Data Science and Machine Learning: First Steps After The Roadmap - Aug 24, 2021.
Just getting into learning data science may seem as daunting as (if not more than) trying to land your first job in the field. With so many options and resources online and in traditional academia to consider, these pre-requisites and pre-work are recommended before diving deep into data science and AI/ML.
Data Science, Machine Learning, Mathematics, Python, Roadmap, Statistics
- Django’s 9 Most Common Applications - Aug 23, 2021.
Django is a Python web application framework enjoying widespread adoption in the data science community. But what else can you use Django for? Read this article for 9 use cases where you can put Django to work.
Django, Programming, Python
- 5 Things That Make My Job as a Data Scientist Easier - Aug 23, 2021.
After working as a Data Scientist for a year, I am here to share some things I learnt along the way that I feel are helpful and have increased my efficiency. Hopefully some of these tips can help you in your journey :)
Data Science, Data Scientist, Metrics, Pandas, Plotly, Python, Time Series, Visualization
- Data Scientist’s Guide to Efficient Coding in Python - Aug 18, 2021.
Read this fantastic collection of tips and tricks the author uses for writing clean code on a day-to-day basis.
Programming, Python, Tips
- Linear Algebra for Natural Language Processing - Aug 17, 2021.
Learn about representing word semantics in vector space.
Linear Algebra, Mathematics, NLP, Python
- Prefect: How to Write and Schedule Your First ETL Pipeline with Python - Aug 16, 2021.
Workflow management systems made easy — both locally and in the cloud.
Cloud, ETL, Pipeline, Python
- Writing Your First Distributed Python Application with Ray - Aug 16, 2021.
Using Ray, you can take Python code that runs sequentially and transform it into a distributed application with minimal code changes. Read on to find out why you should use Ray, and how to get started.
Distributed Computing, Parallelism, Python, Workflow
- How to Train a BERT Model From Scratch - Aug 13, 2021.
Meet BERT’s Italian cousin, FiliBERTo.
BERT, Hugging Face, NLP, Python, Training
- How to Query Your Pandas Dataframe - Aug 9, 2021.
A Data Scientist’s perspective on SQL-like Python functions.
Data Preprocessing, Data Processing, Pandas, Python, SQL
- GPU-Powered Data Science (NOT Deep Learning) with RAPIDS - Aug 2, 2021.
How to utilize the power of your GPU for regular data science and machine learning even if you do not do a lot of deep learning work.
Data Science, GPU, Python
- KDnuggets™ News 21:n28, Jul 28: Design patterns in machine learning; The Best NLP Course is Free - Jul 28, 2021.
What are the design patterns for machine learning, and why should you know them? For more advanced readers, how to use Kafka Connect to create an open source data pipeline for processing real-time data; The state-of-the-art NLP course is freely available; Python Data Structures Compared; Update your Machine Learning skills this summer.
Kafka, Machine Learning, NLP, Python
- Python Data Structures Compared - Jul 27, 2021.
Let's take a look at 5 different Python data structures and see how they could be used to store data we might be processing in our everyday tasks, as well as the relative memory they use for storage and time they take to create and access.
Data Science, Programming, Python
- Why and how should you learn “Productive Data Science”? - Jul 26, 2021.
What is Productive Data Science and what are some of its components?
Books, Career Advice, Courses, Data Science, Python
- Top Python Data Science Interview Questions - Jul 23, 2021.
Six must-know technical concepts and two types of questions to test them.
Data Science, Interview Questions, Programming, Python
- Overview of Albumentations: Open-source library for advanced image augmentations - Jul 22, 2021.
With code snippets on augmentations and integrations with PyTorch and TensorFlow pipelines.
Image Processing, Open Source, Python, PyTorch, TensorFlow
- ColabCode: Deploying Machine Learning Models From Google Colab - Jul 22, 2021.
New to ColabCode? Learn how to use it to start a VS Code Server, Jupyter Lab, or FastAPI.
Deployment, FastAPI, Google Colab, Machine Learning, Python
- Understanding BERT with Hugging Face - Jul 20, 2021.
We don’t really understand something before we implement it ourselves. So in this post, we will implement a Question Answering Neural Network using BERT and a Hugging Face Library.
BERT, Hugging Face, NLP, Python
- How Much Memory is your Machine Learning Code Consuming? - Jul 19, 2021.
Learn how to quickly check the memory footprint of your machine learning function or module with a single command, and generate a report as well.
Machine Learning, Programming, Python
- Top 6 Data Science Online Courses in 2021 - Jul 15, 2021.
As an aspiring data scientist, it is easy to get overwhelmed by the abundance of resources available on the Internet. With these 6 online courses, you can progress from novice to experienced practitioner in less than a year and gain the skills necessary to land a job in data science.
Data Science Education, Online Education, Programming, Python, SQL
- Date Processing and Feature Engineering in Python - Jul 15, 2021.
Have a look at some code to streamline the parsing and processing of dates in Python, including the engineering of some useful and common features.
Beginners, Data Preprocessing, Data Processing, Feature Engineering, Python, Time Series
- KDnuggets™ News 21:n26, Jul 14: Pandas not enough? Here are a few good alternatives to processing larger and faster data in Python; 5 Python Data Processing Tips - Jul 14, 2021.
If Pandas is not enough, here are a few good alternatives for processing larger data faster in Python; 5 Python Data Processing Tips and Code Snippets; Relax! Data Scientists will not go extinct in 10 years, but the role will change; How to Get Practical Data Science Experience to be Career-Ready.
Pandas, Python, Trends
- How to Tell if You Have Trained Your Model with Enough Data - Jul 12, 2021.
WeightWatcher is an open-source, diagnostic tool for evaluating the performance of (pre)-trained and fine-tuned Deep Neural Networks. It is based on state-of-the-art research into Why Deep Learning Works.
Learning, Neural Networks, Python, Training
- 5 Python Data Processing Tips & Code Snippets - Jul 9, 2021.
This is a small collection of Python code snippets that a beginner might find useful for data processing.
Data Preprocessing, Data Processing, Pandas, Programming, Python
- Pandas not enough? Here are a few good alternatives to processing larger and faster data in Python - Jul 8, 2021.
While the Pandas library remains a crucial workhorse in data processing and management for data science, it has limitations that can hurt efficiency, especially with very large data sets. Here, a few interesting alternatives to Pandas are introduced to improve your large data handling performance.
Dask, Modin, Pandas, Python, Scalability
- How to Build An Image Classifier in Few Lines of Code with Flash - Jul 7, 2021.
Introducing Flash: The high-level deep learning framework for beginners.
Deep Learning, Image Classification, Image Recognition, Neural Networks, Python
- KDnuggets™ News 21:n25, Jul 7: Data Scientists and ML Engineers Are Luxury Employees; 5 Lessons from McKinsey That Will Make You a Better Data Scientist - Jul 7, 2021.
Are Data Scientists and ML Engineers Luxury Employees? 5 Lessons McKinsey Taught Me That Will Make You a Better Data Scientist; Managing Your Reusable Python Code as a Data Scientist; GitHub Copilot: Your AI pair programmer - what is all the fuss about? and more.
Career Advice, Data Science Skills, Data Scientist, Machine Learning Engineer, Python
- ROC Curve Explained - Jul 6, 2021.
Learn to visualise a ROC curve in Python.
Data Visualization, Metrics, Python, ROC-AUC