- Here are the Most Popular Python IDEs/Editors - Oct 6, 2020.
Jupyter Notebook continues to lead as the most popular Python IDE, but its share has declined since the last poll. The top 4 contenders have remained the same, but only one has significantly improved its share. We also examine the breakdown by employment and region.
IDE, Jupyter, Poll, PyCharm, Python, Visual Studio Code
- 10 Best Machine Learning Courses in 2020 - Oct 6, 2020.
If you are ready to take your career in machine learning to the next level, then these top 10 Machine Learning Courses covering both practical and theoretical work will help you excel.
Courses, DataCamp, Deep Learning, fast.ai, Machine Learning, Online Education, Python, Stanford
- Your Guide to Linear Regression Models - Oct 5, 2020.
This article explains linear regression and how to program linear regression models in Python.
Linear Regression, Python
- Key Machine Learning Technique: Nested Cross-Validation, Why and How, with Python code - Oct 5, 2020.
Selecting the best performing machine learning model with optimal hyperparameters can sometimes still end up with a poorer performance once in production. This phenomenon might be the result of tuning the model and evaluating its performance on the same sets of train and test data. So, validating your model more rigorously can be key to a successful outcome.
Cross-validation, Machine Learning, Python
- KDnuggets™ News 20:n37, Sep 30: Introduction to Time Series Analysis in Python; How To Improve Machine Learning Model Accuracy - Sep 30, 2020.
Learn how to work with time series in Python; Tips for improving Machine Learning model accuracy from 80% to over 90%; Geographical Plots with Python; Best methods for making Python programs blazingly fast; Read a complete guide to PyTorch; KDD Best Paper Awards and more.
Accuracy, Geospatial, KDD, Performance, Python, PyTorch, Time Series
- Looking Inside The Blackbox: How To Trick A Neural Network - Sep 28, 2020.
In this tutorial, I’ll show you how to use gradient ascent to figure out how to misclassify an input.
Neural Networks, Python, PyTorch, PyTorch Lightning
- Geographical Plots with Python - Sep 28, 2020.
When your data includes geographical information, rich map visualizations can offer significant value for you to understand your data and for the end user when interpreting analytical results.
Choropleth, Data Visualization, Geospatial, Plotly, Python
- Making Python Programs Blazingly Fast - Sep 25, 2020.
Let’s look at the performance of our Python programs and see how to make them up to 30% faster!
Development, Optimization, Programming, Python
- Create and Deploy your First Flask App using Python and Heroku - Sep 25, 2020.
Flask is a straightforward and lightweight web application framework for Python applications. This guide walks you through how to write an application using Flask with a deployment on Heroku.
App, Deployment, Flask, Heroku, Python
- Introduction to Time Series Analysis in Python - Sep 24, 2020.
Data that is updated in real-time requires additional handling and special care to prepare it for machine learning models. The important Python library, Pandas, can be used for most of this work, and this tutorial guides you through this process for analyzing time-series data.
Pandas, Python, time, Time Series
- The Most Complete Guide to PyTorch for Data Scientists - Sep 24, 2020.
All the PyTorch functionality you will ever need while doing Deep Learning, from an experimentation/research perspective.
Data Science, Data Scientist, Neural Networks, Python, PyTorch
- KDnuggets™ News 20:n36, Sep 23: New Poll: What Python IDE / Editor you used the most in 2020?; Automating Every Aspect of Your Python Project - Sep 23, 2020.
New Poll: What Python IDE / Editor you used the most in 2020?; Automating Every Aspect of Your Python Project; Autograd: The Best Machine Learning Library You're Not Using?; Implementing a Deep Learning Library from Scratch in Python; Online Certificates/Courses in AI, Data Science, Machine Learning; Can Neural Networks Show Imagination?
Automation, Certificate, Courses, Data Science, Deep Learning, DeepMind, Machine Learning, Neural Networks, Python
- New Poll: What Python IDE / Editor you used the most in 2020? - Sep 22, 2020.
The latest KDnuggets poll asks which Python IDE / Editor you have used the most in 2020. Participate now, and share your experiences with the community.
Data Science, Development, IDE, Poll, Programming, Python
- Statistical and Visual Exploratory Data Analysis with One Line of Code - Sep 21, 2020.
If EDA is not executed correctly, it can cause us to start modeling with “unclean” data. See how to use Pandas Profiling to perform EDA with a single line of code.
Data Exploration, Data Visualization, Pandas, Python
- Automating Every Aspect of Your Python Project - Sep 18, 2020.
Every Python project can benefit from automation using Makefile, optimized Docker images, well configured CI/CD, Code Quality Tools and more…
Development, DevOps, Docker, Programming, Python
- Implementing a Deep Learning Library from Scratch in Python - Sep 17, 2020.
A beginner’s guide to understanding the fundamental building blocks of deep learning platforms.
Deep Learning, Neural Networks, Python
- Autograd: The Best Machine Learning Library You’re Not Using? - Sep 16, 2020.
If there is a Python library that is emblematic of the simplicity, flexibility, and utility of differentiable programming it has to be Autograd.
Deep Learning, Neural Networks, Python, PyTorch
- KDnuggets™ News 20:n35, Sep 16: Data Science Skills: Core, Emerging, and Most Wanted; Free From MIT: Intro to CS, Programming in Python - Sep 16, 2020.
Check the analysis of the latest KDnuggets Poll: which data science skills are core, which are emerging, and which skill readers most want to learn; Free From MIT: Intro to CS and Programming in Python; 8 AI/Machine Learning Projects To Make Your Portfolio Stand Out; Statistics with Julia: The Free eBook; and more.
Deep Learning, Julia, Kaggle, MIT, Python
- Visualization Of COVID-19 New Cases Over Time In Python - Sep 15, 2020.
Inspired by another concise data visualization, the author of this article has crafted and shared the code for a heatmap which visualizes the COVID-19 pandemic in the United States over time.
Coronavirus, COVID-19, Data Visualization, Python, Time Series, Visualization
- An Introduction to NLP and 5 Tips for Raising Your Game - Sep 11, 2020.
This article is a collection of things the author would like to have known when they started out in NLP. Perhaps it will be useful for you.
Beginners, NLP, Python
- Free From MIT: Intro to Computer Science and Programming in Python - Sep 9, 2020.
This free introductory computer science and programming course is available via MIT's Open Courseware platform. It's a great resource for mastering the fundamentals of one of data science's major requirements.
Computer Science, Courses, MIT, Programming, Python
- Modern Data Science Skills: 8 Categories, Core Skills, and Hot Skills - Sep 8, 2020.
We analyze the results of the Data Science Skills poll, including 8 categories of skills, 13 core skills that over 50% of respondents have, the emerging/hot skills that data scientists want to learn, and what is the top skill that Data Scientists want to learn.
Communication, Data Preparation, Data Science Skills, Data Visualization, Excel, GitHub, Mathematics, Poll, Python, Reinforcement Learning, scikit-learn, SQL, Statistics
- 4 Tricks to Effectively Use JSON in Python - Sep 8, 2020.
Working with JSON in Python is a breeze; this will get you started right away.
Programming, Python, Tips
- 10 Things You Didn’t Know About Scikit-Learn - Sep 3, 2020.
Check out these 10 things you didn’t know about Scikit-Learn... until now.
Machine Learning, Python, scikit-learn
- Computer Vision Recipes: Best Practices and Examples - Sep 2, 2020.
This is an overview of a great computer vision resource from Microsoft, which demonstrates best practices and implementation guidelines for a variety of tasks and scenarios.
Best Practices, Computer Vision, Microsoft, Python
- Which methods should be used for solving linear regression? - Sep 2, 2020.
As a foundational set of algorithms in any machine learning toolbox, linear regression can be solved with a variety of approaches. Here, we discuss four methods, with code examples, and demonstrate how they should be used.
Gradient Descent, Linear Regression, numpy, Python, Statistics, SVD
- PyCaret 2.1 is here: What’s new? - Sep 1, 2020.
PyCaret is an open-source, low-code machine learning library in Python that automates the machine learning workflow. It is an end-to-end machine learning and model management tool that speeds up the machine learning experiment cycle and makes you 10x more productive. Read about what's new in PyCaret 2.1.
Machine Learning, PyCaret, Python
- Explainable and Reproducible Machine Learning Model Development with DALEX and Neptune - Aug 27, 2020.
With ML models serving real people, misclassified cases (which are a natural consequence of using ML) are affecting people's lives and sometimes treating them very unfairly. This makes the ability to explain your models’ predictions a requirement rather than just a nice-to-have.
Dalex, Explainability, Explainable AI, Interpretability, Python, SHAP
- Working with Spark, Python or SQL on Azure Databricks - Aug 27, 2020.
Here we look at some ways to interchangeably work with Python, PySpark and SQL using Azure Databricks, an Apache Spark-based big data analytics service designed for data science and data engineering offered by Microsoft.
Apache Spark, Databricks, Microsoft Azure, Python, SQL
- Data Science Tools Illustrated Study Guides - Aug 25, 2020.
These data science tools illustrated guides are broken up into four distinct categories: data retrieval, data manipulation, data visualization, and engineering tips. Both online and PDF versions of these guides are available.
Cheat Sheet, Data Preprocessing, Data Processing, Data Science, Data Science Tools, Data Visualization, Python, R, SQL
- Rapid Python Model Deployment with FICO Xpress Insight - Aug 20, 2020.
The biggest hurdle in using data to create business value is the ability to operationalize analytics throughout the organization. Xpress Insight is geared to reduce the burden on IT and address their critical requirements while empowering business users to take ownership of decisions and change management.
AI, Deployment, FICO, Machine Learning, Optimization, Python
- Build Your Own AutoML Using PyCaret 2.0 - Aug 20, 2020.
In this post we present a step-by-step tutorial on how PyCaret can be used to build an Automated Machine Learning Solution within Power BI, thus allowing data scientists and analysts to add a layer of machine learning to their Dashboards without any additional license or software costs.
Automated Machine Learning, AutoML, Power BI, PyCaret, Python
- The List of Top 10 Lists in Data Science - Aug 14, 2020.
The list of Top 10 lists that Data Scientists -- from enthusiasts to those who want to jump start a career -- must know to smoothly navigate a path through this field.
Algorithms, Data Science, Data Science Skills, Datasets, Influencers, LinkedIn, Python, Top 10
- Bring your Pandas Dataframes to life with D-Tale - Aug 13, 2020.
Bring your Pandas dataframes to life with D-Tale. D-Tale is an open-source solution with which you can visualize, analyze, and learn how to code Pandas data structures. In this tutorial you'll learn how to open the grid, build columns, create charts, and view code exports.
Data Exploration, Data Science, Data Visualization, Pandas, Python
- GitHub is the Best AutoML You Will Ever Need - Aug 12, 2020.
This article uses PyCaret 2.0, an open source, low-code machine learning library in Python to develop a simple AutoML solution and deploy it as a Docker container using GitHub actions.
Automated Machine Learning, AutoML, GitHub, PyCaret, Python
- Will Reinforcement Learning Pave the Way for Accessible True Artificial Intelligence? - Aug 11, 2020.
Python Machine Learning, Third Edition covers the essential concepts of reinforcement learning, starting from its foundations, and how RL can support decision making in complex environments. Read more on the topic from the book's author Sebastian Raschka.
Machine Learning, Packt Publishing, Python, Reinforcement Learning, Sebastian Raschka
- Setting Up Your Data Science & Machine Learning Capability in Python - Aug 4, 2020.
With Python and its rich, dynamic ecosystem continuing to lead as a programming language for data science and machine learning, establishing and maintaining a cost-effective development environment is crucial to your business impact. So, do you rent or buy? This overview considers the hidden and obvious factors involved in selecting and implementing your Python platform.
Cloud Computing, Data Science, Machine Learning, Python, Saturn Cloud
- Announcing PyCaret 2.0 - Aug 3, 2020.
PyCaret 2.0 has been released! Find out about all of the updates and see examples of how to use them right here.
Machine Learning, PyCaret, Python
- The Machine Learning Field Guide - Aug 3, 2020.
This straightforward guide offers a structured overview of all machine learning prerequisites needed to start working on your project, including the complete data pipeline from importing and cleaning data to modelling and production.
Data Preparation, Machine Learning, Pandas, Predictive Modeling, Python
- Fuzzy Joins in Python with d6tjoin - Jul 31, 2020.
Combining different data sources is a time suck! d6tjoin is a Python library that lets you join pandas dataframes quickly and efficiently.
Data Processing, Pandas, Python
- Scaling Computer Vision Models with Dataflow - Jul 31, 2020.
Scaling Machine Learning models is hard and expensive. We will briefly introduce the Google Cloud service Dataflow and show how it can be used to run predictions on millions of images in a serverless way.
Computer Vision, Dataflow, Google, Python, Scalability
- A Complete Guide To Survival Analysis In Python, part 3 - Jul 30, 2020.
Concluding this three-part series covering a step-by-step review of statistical survival analysis, we look at a detailed example implementing the Kaplan-Meier fitter based on different groups, a Log-Rank test, and Cox Regression, all with examples and shared code.
Jupyter, Python, Regression, Statistics, Survival Analysis
- KDnuggets™ News 20:n29, Jul 29: Easy Guide To Data Preprocessing In Python; Building a better Spark UI; Computational Algebra for Coders: The Free Course - Jul 29, 2020.
An easy guide to data pre-processing in Python; Monitoring Apache Spark with a better Spark UI; Computational Linear Algebra for Coders: the free course; Labelling data with Snorkel; Bayesian Statistics.
Apache Spark, Bayesian, Data Preprocessing, Linear Algebra, Python
- Building a Content-Based Book Recommendation Engine - Jul 28, 2020.
In this blog, we will see how we can build a simple content-based recommender system using Goodreads data.
Python, Recommendation Engine, Recommender Systems
- Labelling Data Using Snorkel - Jul 24, 2020.
In this tutorial, we walk through the process of using Snorkel to generate labels for an unlabelled dataset. We will provide you examples of basic Snorkel components by guiding you through a real clinical application of Snorkel.
Data Labeling, Data Science, Deep Learning, Machine Learning, NLP, Python
- Powerful CSV processing with kdb+ - Jul 23, 2020.
This article provides a glimpse into the available tools to work with CSV files and describes how kdb+ and its query language q raise CSV processing to a new level of performance and simplicity.
Data Analysis, Data Processing, Python
- Apache Spark Cluster on Docker - Jul 22, 2020.
Build your own Apache Spark cluster in standalone mode on Docker with a JupyterLab interface.
Apache Spark, Data Engineering, Docker, Jupyter, Python
- KDnuggets™ News 20:n28, Jul 22: Data Science MOOCs are too Superficial; The Bitter Lesson of Machine Learning - Jul 22, 2020.
Data Science MOOCs are too Superficial; The Bitter Lesson of Machine Learning; Building a REST API with Tensorflow Serving (Part 1); 3 Advanced Python Features You Should Know; Understanding How Neural Networks Think.
API, Data Science, Machine Learning, MOOC, Neural Networks, Python, Richard Sutton, TensorFlow
- Building a REST API with Tensorflow Serving (Part 2) - Jul 21, 2020.
This post is the second part of the tutorial of Tensorflow Serving in order to productionize Tensorflow objects and build a REST API to make calls to them.
API, Docker, Keras, Python, TensorFlow
- Recurrent Neural Networks (RNN): Deep Learning for Sequential Data - Jul 20, 2020.
Recurrent Neural Networks can be used in a number of ways, such as predicting the next word or letter, forecasting financial asset prices in a temporal space, action modeling in sports, music composition, image generation, and more.
Deep Learning, Python, Recurrent Neural Networks, Sequences, TensorFlow
- How to Handle Dimensions in NumPy - Jul 20, 2020.
Learn how to deal with NumPy matrix dimensionality using np.reshape, np.newaxis and np.expand_dims, illustrated with Python code.
Python
- 3 Advanced Python Features You Should Know - Jul 16, 2020.
As a Data Scientist, you are already spending most of your time getting your data ready for prime time. Follow these real-world scenarios to learn how to leverage the advanced techniques in Python of list comprehension, Lambda expressions, and the Map function to get the job done faster.
Pandas, Programming, Python, Tips
- Building a REST API with Tensorflow Serving (Part 1) - Jul 15, 2020.
Part one of a tutorial to teach you how to build a REST API around functions or saved models created in Tensorflow. With Tensorflow Serving and Docker, defining endpoint URLs and sending HTTP requests is simple.
API, Keras, Python, TensorFlow
- KDnuggets™ News 20:n27, Jul 15: Great explanation of Calculus, the Key to Deep Learning; 8 data-driven reasons to learn Python - Jul 15, 2020.
We bring you free MIT courses on Calculus, which is the key to understanding Deep Learning - check this amazing explanation of an integral and dx; 8 data-driven reasons to learn Python; How to get and analyze Financial data with Python; Free ebook: The Foundations of Data Science and more.
Data Science, Free ebook, Mathematics, Online Education, Python
- A Complete Guide To Survival Analysis In Python, part 2 - Jul 14, 2020.
Continuing with the second of this three-part series covering a step-by-step review of statistical survival analysis, we look at a detailed example implementing the Kaplan-Meier fitter theory as well as the Nelson-Aalen fitter theory, both with examples and shared code.
Python, Statistics, Survival Analysis
- PyTorch LSTM: Text Generation Tutorial - Jul 13, 2020.
The key elements of an LSTM are its ability to work with sequences and its gating mechanism.
LSTM, Natural Language Generation, NLP, Python, PyTorch
- Why Learn Python? Here Are 8 Data-Driven Reasons - Jul 10, 2020.
In this blog, I list 8 major data-driven reasons why you should learn Python.
Data Science, Programming, Programming Languages, Python
- 5 Things You Don’t Know About PyCaret - Jul 9, 2020.
In comparison with the other open source machine learning libraries, PyCaret is an alternate low-code library that can be used to replace hundreds of lines of code with a few words only.
Machine Learning, PyCaret, Python
- Learn Python, ML, Deep Learning, Data Visualization and more in Italy with BIG DIVE - Jul 9, 2020.
Do you want to learn or upgrade your data proficiency and push your career forward? This year, under the umbrella of BIG DIVE, TOP-IX presents four full-time 1-week courses from beginner to advanced levels. Read more and register now.
Courses, Deep Learning, Italy, Python
- Pull and Analyze Financial Data Using a Simple Python Package - Jul 9, 2020.
We demonstrate a simple Python script/package to help you pull financial data (all the important metrics and ratios that you can think of) and plot them.
Finance, Pandas, Python
- Spam Filter in Python: Naive Bayes from Scratch - Jul 8, 2020.
In this blog post, learn how to build a spam filter using Python and the multinomial Naive Bayes algorithm, with a goal of classifying messages with greater than 80% accuracy.
Classification, Naive Bayes, Python, Text Classification
- KDnuggets™ News 20:n26, Jul 8: Speed up Your Numpy and Pandas; A Layman’s Guide to Data Science; Getting Started with TensorFlow 2 - Jul 8, 2020.
Speed up your Numpy and Pandas with NumExpr Package; A Layman's Guide to Data Science. Part 3: Data Science Workflow; Getting Started with TensorFlow 2; Feature Engineering in SQL and Python: A Hybrid Approach; Deploy Machine Learning Pipeline on AWS Fargate
AWS, Data Science, Deep Learning, numpy, Pandas, Python, TensorFlow, Workflow
- A Complete Guide To Survival Analysis In Python, part 1 - Jul 7, 2020.
This three-part series covers a review with step-by-step explanations and code for how to perform statistical survival analysis used to investigate the time some event takes to occur, such as patient survival during the COVID-19 pandemic, the time to failure of engineering products, or even the time to closing a sale after an initial customer contact.
Python, Statistics, Survival Analysis
- Exploratory Data Analysis on Steroids - Jul 6, 2020.
Exploratory data analysis is a central aspect of Data Science, which sometimes gets overlooked. The first step of anything you do should be to know your data: understand it, get familiar with it. This concept gets even more important as you increase your data volume: imagine trying to parse through thousands or millions of records and make sense out of them.
Data Analysis, Data Exploration, Data Preparation, Pandas, Python
- Feature Engineering in SQL and Python: A Hybrid Approach - Jul 2, 2020.
Set up your workstation, reduce workplace clutter, maintain a clean namespace, and effortlessly keep your dataset up-to-date.
Feature Engineering, Python, SQL
- Getting Started with TensorFlow 2 - Jul 2, 2020.
Learn about the latest version of TensorFlow with this hands-on walk-through of implementing a classification problem with deep learning, how to plot it, and how to improve its results.
Advice, Beginners, Deep Learning, Python, Regularization, TensorFlow
- PyTorch Multi-GPU Metrics Library and More in New PyTorch Lightning Release - Jul 2, 2020.
PyTorch Lightning, a very light-weight structure for PyTorch, recently released version 0.8.1, a major milestone. With incredible user adoption and growth, they are continuing to build tools to easily do AI research.
GPU, Metrics, Python, PyTorch, PyTorch Lightning
- Speed up your Numpy and Pandas with NumExpr Package - Jul 1, 2020.
We show how to significantly speed up your mathematical calculations in Numpy and Pandas using a small library.
numpy, Pandas, Python
- Data Cleaning: The secret ingredient to the success of any Data Science Project - Jul 1, 2020.
With an uncleaned dataset, no matter what type of algorithm you try, you will never get accurate results. That is why data scientists spend a considerable amount of time on data cleaning.
Data Cleaning, Data Preparation, Data Science, Outliers, Python
- Data Science Tools Popularity, animated - Jun 25, 2020.
Watch the evolution of the top 10 most popular data science tools based on KDnuggets software polls from 2000 to 2019.
About KDnuggets, Data Science Platform, Poll, Python, R
- Machine Learning in Dask - Jun 22, 2020.
In this piece, we’ll see how we can use Dask to work with large datasets on our local machines.
Dask, Machine Learning, Pandas, Python
- 4 Free Math Courses to do and Level up your Data Science Skills - Jun 22, 2020.
Just as there is no Data Science without data, there's no science in data without mathematics. Strengthening your foundational skills in math will level you up as a data scientist and enable you to perform with greater expertise.
Bayesian, Coursera, edX, Inference, Linear Algebra, Mathematics, Online Education, Principal component analysis, Probability, Python, Statistics
- How to Deal with Missing Values in Your Dataset - Jun 22, 2020.
In this article, we are going to talk about how to identify and treat the missing values in the data step by step.
Data Preparation, Data Preprocessing, Missing Values, Python
- The Most Important Fundamentals of PyTorch you Should Know - Jun 18, 2020.
PyTorch is a constantly developing deep learning framework with many exciting additions and features. We review its basic elements and show an example of building a simple Deep Neural Network (DNN) step-by-step.
Deep Learning, Neural Networks, Python, PyTorch, Tensor
- LightGBM: A Highly-Efficient Gradient Boosting Decision Tree - Jun 18, 2020.
LightGBM is a histogram-based algorithm which places continuous values into discrete bins, which leads to faster training and more efficient memory usage. In this piece, we’ll explore LightGBM in depth.
Decision Trees, Gradient Boosting, Machine Learning, Python
- KDnuggets™ News 20:n24, Jun 17: Easy Speech-to-Text with Python; Data Distributions Overview; Java for Data Scientists - Jun 17, 2020.
Also: Deploy a Machine Learning Pipeline to the Cloud Using a Docker Container; Five Cognitive Biases In Data Science (And how to avoid them); Understanding Machine Learning: The Free eBook; Simplified Mixed Feature Type Preprocessing in Scikit-Learn with Pipelines; A Complete guide to Google Colab for Deep Learning
Data Preprocessing, Data Science, Data Scientist, Distribution, Docker, Free ebook, Java, K-means, NLP, Python, Speech
- Simplified Mixed Feature Type Preprocessing in Scikit-Learn with Pipelines - Jun 16, 2020.
There is a quick and easy way to perform preprocessing on mixed feature type data in Scikit-Learn, which can be integrated into your machine learning pipelines.
Data Preprocessing, Pipeline, Python, scikit-learn
- Deploy a Machine Learning Pipeline to the Cloud Using a Docker Container - Jun 12, 2020.
In this tutorial, we will use a previously-built machine learning pipeline and Flask app to demonstrate how to deploy a machine learning pipeline as a web app using the Microsoft Azure Web App Service.
Cloud, Docker, Machine Learning, Pipeline, PyCaret, Python
- Easy Speech-to-Text with Python - Jun 10, 2020.
In this blog, I demonstrate how to convert speech to text using Python. This can be done with the help of the “Speech Recognition” API and “PyAudio” library.
NLP, Python, Speech
- Natural Language Processing with Python: The Free eBook - Jun 8, 2020.
This free eBook is an introduction to natural language processing, and to NLTK, one of the most prevalent Python NLP libraries.
Free ebook, NLP, NLTK, Python
- Deep Learning for Detecting Pneumonia from X-ray Images - Jun 5, 2020.
This article covers an end-to-end pipeline for pneumonia detection from X-ray images.
Deep Learning, Healthcare, Image Recognition, Python
- Machine Learning Experiment Tracking - Jun 4, 2020.
Why is experiment tracking so important for doing real world machine learning?
Experimentation, Machine Learning, Python
- Introduction to Convolutional Neural Networks - Jun 3, 2020.
The article focuses on explaining the key components of CNNs and their implementation using the Keras Python library.
Convolutional Neural Networks, Keras, Neural Networks, Python
- Model Evaluation Metrics in Machine Learning - May 28, 2020.
A detailed explanation of model evaluation metrics to evaluate a classification machine learning model.
Classification, Confusion Matrix, Machine Learning, Metrics, Python, Regression
- Taming Complexity in MLOps - May 28, 2020.
A greatly expanded v2.0 of the open-source Orbyter toolkit helps data science teams continue to streamline machine learning delivery pipelines, with an emphasis on seamless deployment to production.
Best Practices, Docker, MLOps, Python
- KDnuggets™ News 20:n21, May 27: The Best NLP with Deep Learning Course is Free; Your First Machine Learning Web App - May 27, 2020.
Also: Python For Everybody: The Free eBook; Complex logic at breakneck speed: Try Julia for data science; An easy guide to choose the right Machine Learning algorithm; Dataset Splitting Best Practices in Python; Appropriately Handling Missing Values for Statistical Modelling and Prediction
Algorithms, Course, Deep Learning, Free ebook, Julia, Machine Learning, NLP, Python
- Dataset Splitting Best Practices in Python - May 26, 2020.
If you are splitting your dataset into training and testing data, you need to keep some things in mind. This discussion of 3 best practices for doing so includes a demonstration of how to implement each consideration in Python.
Datasets, Python, scikit-learn, Training Data, Validation
- 10 Useful Machine Learning Practices For Python Developers - May 25, 2020.
While you may be a data scientist, you are still a developer at the core. This means your code should be skillful. Follow these 10 tips to make sure you quickly deliver bug-free machine learning solutions.
Best Practices, Machine Learning Engineer, Python
- Python For Everybody: The Free eBook - May 25, 2020.
Get back to fundamentals with this free eBook, Python For Everybody, approaching the learning of programming from a data analysis perspective.
Algorithms, Free ebook, Programming, Python
- Build and deploy your first machine learning web app - May 22, 2020.
A beginner’s guide to training and deploying machine learning pipelines in Python using PyCaret.
App, Flask, Heroku, Machine Learning, Modeling, Open Source, Pipeline, PyCaret, Python
- Dimensionality Reduction with Principal Component Analysis (PCA) - May 21, 2020.
This article focuses on design principles of the PCA algorithm for dimensionality reduction and its implementation in Python from scratch.
Dimensionality Reduction, numpy, PCA, Python
- Pandas in action! - May 20, 2020.
Pandas is instantly familiar to anyone who’s used spreadsheet software, whether that’s Google Sheets or good old Excel. It’s got columns, it’s got grids, it’s got rows; but pandas is far more powerful. Save 40% with code nlkdpandas40 on this book, and other Manning books and videos.
Book, Manning, Pandas, Python
- Complex logic at breakneck speed: Try Julia for data science - May 20, 2020.
We benchmark Julia against equivalent Python code to show why Julia is great for data science and machine learning.
Benchmark, Data Science, Julia, numpy, Python
- Easy Text-to-Speech with Python - May 18, 2020.
Python comes with a lot of handy and easily accessible libraries and we’re going to look at how we can deliver text-to-speech with Python in this article.
NLP, Python, Speech
- 5 Great New Features in Scikit-learn 0.23 - May 15, 2020.
Check out 5 new features of the latest Scikit-learn release, including the ability to visualize estimators in notebooks, improvements to both k-means and gradient boosting, some new linear model implementations, and sample weight support for a pair of existing regressors.
Gradient Boosting, Jupyter, K-means, Machine Learning, Python, Regression, scikit-learn
- Machine Learning in Power BI using PyCaret - May 12, 2020.
Check out this step-by-step tutorial for implementing machine learning in Power BI within minutes.
Clustering, K-means, Machine Learning, Microsoft, Power BI, PyCaret, Python
- Text Mining in Python: Steps and Examples - May 12, 2020.
The majority of data exists in textual form, which is a highly unstructured format. To produce meaningful insights from text data, we need to follow a method called Text Analysis.
NLP, Python, Text Mining
- Hyperparameter Optimization for Machine Learning Models - May 7, 2020.
Check out this comprehensive guide to model optimization techniques.
Hyperparameter, Machine Learning, Modeling, Optimization, Python
- Explaining “Blackbox” Machine Learning Models: Practical Application of SHAP - May 6, 2020.
Train a "blackbox" GBM model on a real dataset and make it explainable with SHAP.
Explainability, Interpretability, Python, SHAP
- KDnuggets™ News 20:n18, May 6: Five Cool Python Libraries for Data Science; NLP Recipes: Best Practices - May 6, 2020.
5 cool Python libraries for Data Science; NLP Recipes: Best Practices and Examples; Deep Learning: The Free eBook; Demystifying the AI Infrastructure Stack; and more.
Deep Learning, GIS, NLP, Python
- Getting Started with Spectral Clustering - May 5, 2020.
This post will unravel a practical example to illustrate and motivate the intuition behind each step of the spectral clustering algorithm.
Clustering, Machine Learning, Python
- Optimize Response Time of your Machine Learning API In Production - May 1, 2020.
This article demonstrates how building a smarter API serving Deep Learning models minimizes the response time.
API, Machine Learning, Optimization, Production, Python
- Natural Language Processing Recipes: Best Practices and Examples - May 1, 2020.
Here is an overview of another great natural language processing resource, this time from Microsoft, which demonstrates best practices and implementation guidelines for a variety of tasks and scenarios.
Best Practices, Microsoft, NLP, Python
- Five Cool Python Libraries for Data Science - Apr 30, 2020.
Check out these 5 cool Python libraries that the author has come across during an NLP project, and which have made their life easier.
Data Science, NLP, Python
- Coronavirus COVID-19 Genome Analysis using Biopython - Apr 29, 2020.
In this article, we interpret and analyze the COVID-19 DNA sequence data and try to get as many insights as possible regarding the proteins that make it up. Later, we compare COVID-19 DNA with that of MERS and SARS to understand the relationship among them.
Analysis, Coronavirus, COVID-19, Python
- Announcing PyCaret 1.0.0 - Apr 21, 2020.
An open source low-code machine learning library in Python. PyCaret is an alternative low-code library that can replace hundreds of lines of code with only a few words. This makes experiments much faster and more efficient.
Machine Learning, Modeling, Open Source, PyCaret, Python
- The Benefits & Examples of Using Apache Spark with PySpark - Apr 21, 2020.
Apache Spark runs fast, offers robust, distributed, fault-tolerant data objects, and integrates beautifully with the world of machine learning and graph analytics. Learn more here.
Apache Spark, Data Management, Python, SQL
- Dockerize Jupyter with the Visual Debugger - Apr 17, 2020.
A step-by-step guide to enabling and using visual debugging in Jupyter in a Docker container.
Data Science, Docker, Jupyter, Python
- Better notebooks through CI: automatically testing documentation for graph machine learning - Apr 16, 2020.
In this article, we’ll walk through the detailed and helpful continuous integration (CI) that supports us in keeping StellarGraph’s demos current and informative.
Graphs, Integration, Jupyter, Machine Learning, Python, Software Engineering
- Top KDnuggets tweets, Apr 08-14: Mathematics for #MachineLearning: The Free eBook – KDnuggets - Apr 15, 2020.
Also Exploratory Data Analysis for Natural Language Processing: A Complete Guide to Python Tools; A professor with 20 years of experience to all high school seniors (and their parents). If you were planning to enroll in college next fall - don't.
Coronavirus, Education, Mathematics, NLP, Python, Top tweets
- Pandas in action - Apr 15, 2020.
Pandas is instantly familiar to anyone who’s used spreadsheet software, whether that’s Google Sheets or good old Excel. It’s got columns, it’s got grids, it’s got rows; but pandas is far more powerful. Save 40% with code nlkdpandas40 on this book, and other Manning books and videos.
Book, Manning, Pandas, Python
- Visualizing Decision Trees with Python (Scikit-learn, Graphviz, Matplotlib) - Apr 15, 2020.
Learn how to visualize decision trees using matplotlib and Graphviz.
Algorithms, Decision Trees, Matplotlib, Python, Visualization
- KDnuggets™ News 20:n15, Apr 15: How to Do Hyperparameter Tuning on Any Python Script; 10 Must-read Machine Learning Articles - Apr 15, 2020.
Learn how to do hyperparameter tuning on Python ML scripts; Read 10 must-read Machine Learning Articles; Understand the process for Data Science project review; see how data science is used to understand COVID-19; and stay safe and healthy!
Hyperparameter, Machine Learning, Python, Research
- Free Metis Corporate Training Series: Intro to Python, Continued - Apr 14, 2020.
Metis Corporate Training is offering Intro to Python, a free, live online training series specially created for business professionals, and an excellent way for a team to begin their Python journey. Classes are taught live, and participants will be able to ask questions in real time. Register now.
Metis, Online Education, Python
- Build PyTorch Models Easily Using torchlayers - Apr 9, 2020.
torchlayers aims to do what Keras did for TensorFlow, providing a higher-level model-building API and some handy defaults and add-ons useful for crafting PyTorch neural networks.
API, Keras, Neural Networks, Python, PyTorch
- How to Do Hyperparameter Tuning on Any Python Script in 3 Easy Steps - Apr 8, 2020.
With your machine learning model in Python just working, it's time to optimize it for performance. Follow this guide to set up automated tuning using any optimization library in three steps.
Hyperparameter, Machine Learning, Optimization, Python
- KDnuggets™ News 20:n14, Apr 8: Free Mathematics for Machine Learning eBook; Epidemiology Courses for Data Scientists - Apr 8, 2020.
Stop Hurting Your Pandas!; Python for data analysis... is it really that simple?!?; Introducing MIDAS: A New Baseline for Anomaly Detection in Graphs; Build an app to generate photorealistic faces using TensorFlow and Streamlit; 5 Ways Data Scientists Can Help Respond to COVID-19 and 5 Actions to Avoid
Anomaly Detection, Data Analysis, Data Science, Free ebook, Healthcare, Mathematics, MOOC, Pandas, Python
- Simple Question Answering (QA) Systems That Use Text Similarity Detection in Python - Apr 7, 2020.
How exactly are smart algorithms able to engage and communicate with us like humans? The answer lies in Question Answering systems that are built on a foundation of Machine Learning and Natural Language Processing. Let's build one here.
NLP, Python, Question answering, Similarity, Text Analytics
- Build an app to generate photorealistic faces using TensorFlow and Streamlit - Apr 7, 2020.
We’ll show you how to quickly build a Streamlit app to synthesize celebrity faces using GANs, TensorFlow, and st.cache.
App, GANs, Generative Adversarial Network, Python, Streamlit, TensorFlow
- Stop Hurting Your Pandas! - Apr 3, 2020.
This post will address the issues that can arise when Pandas slicing is used improperly. If you see the warning that reads "A value is trying to be set on a copy of a slice from a DataFrame", this post is for you.
Pandas, Programming, Python
- Free Metis Corporate Training Series: Intro to Python - Apr 2, 2020.
Metis Corporate Training is offering Intro to Python, a free, live online training series specially created for business professionals, and an excellent way for a team to begin their Python journey. Classes are taught live, and participants will be able to ask questions in real time. Register now.
Metis, Online Education, Python
- Python for data analysis… is it really that simple?!? - Apr 2, 2020.
The article addresses a simple data analytics problem, comparing a Python and Pandas solution to an R solution (using plyr, dplyr, and data.table), as well as kdb+ and BigQuery solutions. Performance improvement tricks for these solutions are then covered, as are parallel/cluster computing approaches and their limitations.
Data Analysis, Pandas, Python, R, SQL
- Introduction to the K-nearest Neighbour Algorithm Using Examples - Apr 1, 2020.
Read this concise summary of KNN, a supervised learning algorithm for pattern classification that assigns a new input to a class based on its k nearest neighbours, as measured by the distance between them.
Algorithms, K-nearest neighbors, Machine Learning, Python, scikit-learn
- KDnuggets™ News 20:n13, Apr 1: Effective visualizations for pandemic storytelling; Machine learning for time series forecasting - Apr 1, 2020.
This week, read about the power of effective visualizations for pandemic storytelling; see how (not) to use machine learning for time series forecasting; learn about a deep learning breakthrough: a sub-linear deep learning algorithm that does not need a GPU?; familiarize yourself with how to painlessly analyze your time series; check out what we can learn from the latest coronavirus trends; and... KDnuggets topics?!? Also, much more.
Coronavirus, Data Visualization, Deep Learning, Distributed, Machine Learning, Python, Time Series
- How To Painlessly Analyze Your Time Series - Mar 26, 2020.
The Matrix Profile is a powerful tool to help solve this dual problem of anomaly detection and motif discovery. Matrix Profile is robust, scalable, and largely parameter-free: we’ve seen it work for a wide range of metrics including website user data, order volume and other business-critical applications.
Anomaly Detection, API, Python, Time Series
- Evaluating Ray: Distributed Python for Massive Scalability - Mar 25, 2020.
If your team has started using Ray and you’re wondering what it is, this post is for you. If you’re wondering if Ray should be part of your technical strategy for Python-based applications, especially ML and AI, this post is for you.
Domino, Python, Scalability
- Build an Artificial Neural Network From Scratch: Part 2 - Mar 20, 2020.
The second article in this series focuses on building an Artificial Neural Network using the Numpy Python library.
Neural Networks, numpy, Python
- The 4 Best Jupyter Notebook Environments for Deep Learning - Mar 19, 2020.
Many cloud providers and other third-party services see the value of a Jupyter notebook environment, which is why many companies now offer cloud-hosted notebooks. Let's have a look at 4 such environments.
Deep Learning, Google Colab, Jupyter, Python, Saturn Cloud
- Exploring the Adoption of Python in the Workplace – Free Metis Corporate Training Webinar - Mar 18, 2020.
Metis will break down Python for data science and analytics, explain what is driving adoption in the field, and discuss how industries and companies are reacting to the shift.
Industry, Metis, Python
- Five Interesting Data Engineering Projects - Mar 17, 2020.
As the role of the data engineer continues to grow in the field of data science, so do the many tools being developed to support wrangling all that data. Five tools that you should pay attention to for your data pipeline work are reviewed here, along with a few bonus tools.
Dask, Data Engineering, dbt, DVC, Python