-
Master Data Science with Anaconda
As a special offer, you're receiving a 30-day FREE trial to Anaconda Notebooks and Learning. To unlock this offer, simply sign up for an Anaconda Nucleus account and use the promo code “NEWYEAR23” at checkout.
-
Top 10 Advanced Data Science SQL Interview Questions You Must Know How to Answer
In this article, we will give a list of commonly asked SQL interview questions to help you prepare for your upcoming technical interview.
By Sonia Jamil, Executive Database Admin (DBA) at Ufone on January 27, 2023 in SQL
-
7 SMOTE Variations for Oversampling
The best oversampling techniques for imbalanced data.
-
Multi-modal deep learning in less than 15 lines of code
Learn how to easily build, iterate and deploy a state-of-the-art deep learning model to predict customer ratings with a declarative approach to machine learning.
-
An Introduction to Markov Chains
Markov chains are often used to model systems that exhibit memoryless behavior, where the system's future behavior depends only on its current state and not on the states that came before it.
-
Top 8 Data Science Slack Communities to Join in 2023
Take your Data Science journey to the next level by joining these Slack communities in 2023.
-
Hyperparameter Optimization: 10 Top Python Libraries
Become familiar with some of the most popular Python libraries available for hyperparameter optimization.
-
The ChatGPT Cheat Sheet
Impress your friends and loved ones by perfecting your ChatGPT prompt engineering game with this incredibly useful resource.
-
KDnuggets News, January 25: ChatGPT as a Python Programming Assistant • Python and Machine Learning to Predict Football Match Winners
ChatGPT as a Python Programming Assistant • How to Use Python and Machine Learning to Predict Football Match Winners • 20 Questions (with Answers) to Detect Fake Data Scientists: ChatGPT Edition, Part 1 • From Data Collection to Model Deployment: 6 Stages of a Data Science Project • 5 Free Data Science Books You Must Read in 2023
-
How to Track the Location of an IP Address using Python
Learn how to geolocate an IP address or a domain name using the Python library Ip2geotools.
-
5 Ways to Deal with the Lack of Data in Machine Learning
Effective solutions exist when you don't have enough data for your models. While there is no perfect approach, five proven ways will get your model to production.
-
Genetic Programming in Python: The Knapsack Problem
This article explores the knapsack problem. We will discuss why it is difficult to solve traditionally and how genetic programming can help find a "good enough" solution. We will then look at a Python implementation of this solution to test out for ourselves.
-
Learn how to design, measure and implement trustworthy A/B tests from leading experimentation expert Ronny Kohavi (ex-Amazon, Airbnb, Microsoft)
Leading expert Ronny Kohavi, drawing from his 20+ years of experience, will walk you through the ins and outs of experimentation, identifying key insights and working through live demos in his live course, Accelerating Innovation with A/B Testing, starting January 30th.
-
7 Best Libraries for Machine Learning Explained
Learn about machine learning libraries for building and deploying machine learning models.
-
Top Posts January 16-22: ChatGPT as a Python Programming Assistant
ChatGPT as a Python Programming Assistant • ChatGPT: Everything You Need to Know • Explainable AI: 10 Python Libraries for Demystifying Your Model’s Decisions • How to Use Python and Machine Learning to Predict Football Match Winners • 20 Questions (with Answers) to Detect Fake Data Scientists: ChatGPT Edition, Part 1
-
Setup and use JupyterHub (TLJH) on AWS EC2
JupyterHub is a multi-user, container-friendly version of the Jupyter Notebook. However, it can be difficult to set up. This blog post will make you less likely to run into issues in this 15+ step process.
-
5 Free Data Science Books You Must Read in 2023
Get your hands on these gems to learn Python, data analytics, machine learning, and deep learning.
-
From Data Collection to Model Deployment: 6 Stages of a Data Science Project
Here are the 6 stages of a novel data science project, from data collection to model in production, backed by research and examples.
-
Scaling Data Management Through Apache Gobblin
Software companies can manage big data at a hyper-scale on different infrastructure stacks using Apache Gobblin.
-
ChatGPT as a Python Programming Assistant
Is ChatGPT useful for Python programmers, specifically those of us who use Python for data processing, data cleaning, and building machine learning models? Let's give it a try and find out.
-
Encoding Categorical Features with MultiLabelBinarizer
Transform multi-label format into a binary matrix for multi-label classification.
-
Mastering String Transformations in RAPIDS libcudf
This post demonstrates how to skillfully transform strings columns with the libcudf general-purpose API. You’ll gain new knowledge on how to unlock peak performance using custom kernels and libcudf device-side utilities.
-
SQL and Data Integration: ETL and ELT
In this article, we will discuss use cases and methods for using ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform) processes along with SQL to integrate data from various sources.
-
KDnuggets News, January 18: 7 Best Platforms to Practice SQL • Explainable AI: 10 Python Libraries for Demystifying Your Model’s Decisions
7 Best Platforms to Practice SQL • Explainable AI: 10 Python Libraries for Demystifying Your Model's Decisions • ChatGPT: Everything You Need to Know • Data Lakes and SQL: A Match Made in Data Heaven • Google Data Analytics Certification Review for 2023
-
How to Use Python and Machine Learning to Predict Football Match Winners
We will be learning web scraping and training supervised machine-learning algorithms to predict winning teams.
-
Things Aren’t Always Normal: Some of the “Other” Distributions
Learn about Gamma, Beta, and Bernoulli distributions with Python.
-
20 Questions (with Answers) to Detect Fake Data Scientists: ChatGPT Edition, Part 1
Can ChatGPT provide answers to data science questions to the same standard as humans? Check out this attempt to do so, and compare the answers to those from experts.
-
Fast-track your next move with in-demand data skills
DataCamp offers over 400 interactive courses, projects, and career tracks in the most popular data technologies such as Python, SQL, R, Power BI, and Tableau. Start today and save up to 67% on career-advancing learning.
-
Idiot’s Guide to Precision, Recall, and Confusion Matrix
Building Machine Learning models is fun, but making sure we build the best ones is what makes a difference. Follow this quick guide to appreciate how to effectively evaluate a classification model, especially for projects where accuracy alone is not enough.
-
Social User Authentication in Django Framework
Learn how to perform social user authentication in a Django web app using third-party services like Google.
-
12 Docker Commands Every Data Scientist Should Know
Looking to add Docker to your data science toolbox? Here’s a list of essential Docker commands to help you get started.
-
KDnuggets Top Posts for December 2022: 5 Python Projects for Data Science Portfolio
3 Free Machine Learning Courses for Beginners • The Complete Machine Learning Study Roadmap • Markdown Cheat Sheet • Learn Data Science From These GitHub Repositories • 7 Essential Cheat Sheets for Data Engineering • Scikit-Learn Cheat Sheet for Machine Learning • 7 Super Cheat Sheets You Need To Ace Machine Learning Interview
-
Data Lakes and SQL: A Match Made in Data Heaven
In this article, we will discuss the benefits of using SQL with a data lake and how it can help organizations unlock the full potential of their data.
-
Top Posts January 9-15: Python Matplotlib Cheat Sheets
Python Matplotlib Cheat Sheets • How to Select Rows and Columns in Pandas • 7 Best Platforms to Practice SQL • How to Perform Unit Testing in Python? • Google Data Analytics Certification Review
-
ChatGPT: Everything You Need to Know
All you need to know about ChatGPT: what it can do, how it works, and its limitations.
-
Explainable AI: 10 Python Libraries for Demystifying Your Model’s Decisions
Become familiar with some of the most popular Python libraries available for AI explainability.
-
Concepts You Should Know Before Getting Into Transformers
Learn about Input Embedding, Positional Encoding, Scaled Dot-Product Attention, Residual Connections, Mask, and Softmax function.
-
7 Best Platforms to Practice SQL
Looking to level up your SQL skills? Here's a list of the best platforms to practice SQL, ace your SQL interviews, and land your dream data role.
-
Top Posts January 2-8: Python Matplotlib Cheat Sheets
Python Matplotlib Cheat Sheets • Free Data Management with Data Science Learning with CS639 • How to Select Rows and Columns in Pandas Using [ ], .loc, iloc, .at and .iat • Creating a Web Application to Extract Topics from Audio with Python • More Data Science Cheatsheets
-
Overcome Your Data Quality Issues with Great Expectations
Bad data costs organizations money, reputation, and time. Hence it is very important to monitor and validate data quality continuously.
-
Approaches to Data Imputation
This guide will discuss what data imputation is as well as the main types of approaches to it.
-
Google Data Analytics Certification Review for 2023
What is the Google Data Analytics Certification? And, more importantly, is it still worth getting it in 2023?
-
KDnuggets News, January 11: Python Matplotlib Cheatsheets • More Data Science Cheatsheets • Data Science & Machine Learning Developments of 2022
Key Data Science, Machine Learning, AI and Analytics Developments of 2022 • Python Matplotlib Cheat Sheets • More Data Science Cheatsheets • Free Data Management with Data Science Learning with CS639 • Data-Driven Holiday Cheer: How Santa is Using Analytics to Make the Season Bright
-
RAPIDS cuDF for Accelerated Data Science on Google Colab
GPU-accelerated dataframe library that implements the familiar pandas API for processing and analyzing your data.
-
Topic Modeling Approaches: Top2Vec vs BERTopic
This post gives an overview of the strengths and differences of these approaches in extracting topics from text.
-
Creating Beautiful Histograms with Seaborn
Visualize the numerical distribution in a beautiful way.
-
Beginner’s Guide to Cloud Computing
Learn how cloud computing works, different types of models, top cloud platforms, and applications.
-
Performing a T-Test in Python
An introduction to the t-test with a Python implementation.
-
Where Collaboration Fails Around Data (And 4 Tips for Fixing It)
Data-driven organizations require complex collaboration between data teams and business stakeholders. Here are 4 proactive tips for reducing information asymmetries and achieving better collaboration.
-
How to Perform Unit Testing in Python?
Unit testing is an important part of the software development life cycle as it helps to ensure that code is correct and working as intended. This article aims to introduce the concept of unit testing in Python and provide a basic tutorial on how to write and run unit tests using the unittest module.
-
3 Things I Wish I Knew When I Started Data Science
Looking back and realizing how I was wrong about the data science career.
-
Creating a Web Application to Extract Topics from Audio with Python
A step-by-step tutorial to build and deploy a web application for topic modeling of a Spotify podcast.
-
Free Data Management with Data Science Learning with CS639
Learn Data Management with Data Science for FREE with CS639.
-
Python Lambda Functions, Explained
Learn the syntax and uses of the lambda function, which is an alternative to the regular Python function.
-
CCC Webinar: Best Practices When Using XML Articles in AI, Machine Learning and Text Mining Projects
Register now for this webinar on Jan. 12 to learn how to simplify the process of using scientific literature within your AI, machine learning, and text mining projects.
-
The Fast and Effective Way to Audit ML for Fairness
Is your model fair? Here's how to audit using the Aequitas Toolkit.
-
How to Merge Pandas DataFrames
Merging data is a common data processing activity. Learn how Pandas provides various ways to merge our data.
-
SQL With CSVs
Write SQL queries to analyze CSV files using a simple command-line tool.
-
A Solid Plan for Learning Data Science, Machine Learning, and Deep Learning
Check out this solid plan for learning Data Science, Machine Learning, and Deep Learning. The entire plan is currently available at no cost to KDnuggets readers.
-
Micro, Macro & Weighted Averages of F1 Score, Clearly Explained
Understanding the concepts behind the micro average, macro average, and weighted average of F1 score in multi-class classification with simple illustrations.
-
Natural Language Processing with spaCy
Learn to build NLP projects using spaCy.
-
Top Data Python Packages to Know in 2023
These Python packages will improve your data workflow.
-
Introduction to Multi-Armed Bandit Problems
Delve deeper into the concept of multi-armed bandits, reinforcement learning, and exploration vs. exploitation dilemma.
-
Python Matplotlib Cheat Sheets
Matplotlib is the most famous and commonly used plotting library in Python. It allows you to create clear and interactive visualizations that make your data easier to understand and your results more concrete.
-
Unsupervised Disentangled Representation Learning in Class Imbalanced Dataset Using Elastic Info-GAN
This paper attempts to address primarily two flaws in the Info-GAN paper while retaining its other good qualities.
-
12 Essential Commands for Streamlit
Learn about the most commonly used Streamlit commands and build a customized web application.
-
More Data Science Cheatsheets
It's time again to look at some data science cheatsheets. Here you can find a short selection of such resources which can cater to different existing levels of knowledge and breadth of topics of interest.
-
Data Science Minimum: 10 Essential Skills You Need to Know to Start Doing Data Science
Data science is ever-evolving, so mastering its foundational technical and soft skills will help you be successful in a career as a Data Scientist, as well as pursue advanced concepts, such as deep learning and artificial intelligence.
-
Top 38 Python Libraries for Data Science, Data Visualization & Machine Learning
This article compiles the 38 top Python libraries for data science, data visualization & machine learning, as best determined by KDnuggets staff.
-
The Zen of Python
Python is one of the programming languages that are very versatile and relatively easy to learn. Hence it is the choice of many new programmers, regardless of what area of tech they are interested in. It is particularly popular in all data science branches.
-
Key Data Science, Machine Learning, AI and Analytics Developments of 2022
It's the end of the year, and so it's time for KDnuggets to assemble a team of experts and get to the bottom of what the most important data science, machine learning, AI and analytics developments of 2022 were.
-
24 Best (and Free) Books To Understand Machine Learning
We have compiled a list of some of the best (and free) machine learning books that will prove helpful for everyone aspiring to build a career in the field.
-
A Guide to Train an Image Classification Model Using Tensorflow
Classify images at scale and with very high accuracy using machine learning and deep learning algorithms.