Tutorials, Overviews
Latest:
-
Automated Anomaly Detection Using PyCaret - Apr 13, 2021.
Learn to automate anomaly detection using the open source machine learning library PyCaret. -
10 Real-Life Applications of Reinforcement Learning - Apr 12, 2021.
In this article, we’ll look at some of the real-world applications of reinforcement learning. -
Zero-Shot Learning: Can you classify an object without seeing it before? - Apr 12, 2021.
Developing machine learning models that can perform predictive functions on data they have never seen before has become an important research area called zero-shot learning. We tend to be pretty great at recognizing things in the world we never saw before, and zero-shot learning offers a possible path toward mimicking this powerful human capability. -
How to Apply Transformers to Any Length of Text - Apr 12, 2021.
Read on to find out how to restore the power of NLP for long sequences. -
Interpretable Machine Learning: The Free eBook - Apr 9, 2021.
Interested in learning more about interpretability in machine learning? Check out this free eBook to learn about the basics, simple interpretable models, and strategies for interpreting more complex black box models. -
Deep Learning Recommendation Models (DLRM): A Deep Dive - Apr 9, 2021.
The currency in the 21st century is no longer just data. It's the attention of people. This deep dive article presents the architecture and deployment issues experienced with the deep learning recommendation model, DLRM, which was open-sourced by Facebook in March 2019. -
NoSQL Explained: Understanding Key-Value Databases - Apr 8, 2021.
Among the four big NoSQL database types, key-value stores are probably the most popular due to their simplicity and fast performance. Let’s further explore how key-value stores work and what their practical uses are. -
A/B Testing: 7 Common Questions and Answers in Data Science Interviews, Part 2 - Apr 8, 2021.
In the second article in this series, we’ll continue to take an interview-driven approach by linking some of the most commonly asked interview questions to different components of A/B testing, including selecting ideas for testing, designing A/B tests, evaluating test results, and making ship or no ship decisions. -
E-commerce Data Analysis for Sales Strategy Using Python - Apr 7, 2021.
Check out this informative and concise case study applying data analysis using Python to a well-defined e-commerce scenario. -
Microsoft Research Trains Neural Networks to Understand What They Read - Apr 7, 2021.
The new models make inroads in a new area of deep learning known as machine reading comprehension. -
Working With Time Series Using SQL - Apr 6, 2021.
This article is an overview of using SQL to manipulate time series data. -
How to Dockerize Any Machine Learning Application - Apr 6, 2021.
How can you -- an awesome Data Scientist -- also be known as an awesome software engineer? Docker, and these 3 simple steps for using it in your solutions over and over again. -
Automated Text Classification with EvalML - Apr 6, 2021.
Learn how EvalML leverages Woodwork, Featuretools and the nlp-primitives library to process text data and create a machine learning model that can detect spam text messages. -
The Best Machine Learning Frameworks & Extensions for TensorFlow - Apr 5, 2021.
Check out this curated list of useful frameworks and extensions for TensorFlow. -
How to deploy Machine Learning/Deep Learning models to the web - Apr 5, 2021.
The full value of your deep learning models comes from enabling others to use them. Learn how to deploy your model to the web and access it as a REST API, and begin to share the power of your machine learning development with the world. -
Awesome Tricks And Best Practices From Kaggle - Apr 5, 2021.
Easily learn what usually takes hours of search and exploration to discover. -
Shapash: Making Machine Learning Models Understandable - Apr 2, 2021.
Establishing an expectation for trust around AI technologies may soon become one of the most important skills provided by Data Scientists. Significant research investments are underway in this area, and new tools are being developed, such as Shapash, an open-source Python library that helps Data Scientists make machine learning models more transparent and understandable. -
What’s ETL? - Apr 2, 2021.
Discover what ETL is, and see in what ways it’s critical for data science. -
Easy AutoML in Python - Apr 1, 2021.
We’re excited to announce that a new open-source project has joined the Alteryx open-source ecosystem. EvalML is a library for automated machine learning (AutoML) and model understanding, written in Python. -
A/B Testing: 7 Common Questions and Answers in Data Science Interviews, Part 1 - Apr 1, 2021.
In this article, we’ll take an interview-driven approach by linking some of the most commonly asked interview questions to different components of A/B testing, including selecting ideas for testing, designing A/B tests, evaluating test results, and making ship or no ship decisions.
March:
- 3 More Free Top Notch Natural Language Processing Courses
- Introduction to the White-Box AI: the Concept of Interpretability
- Software Engineering Best Practices for Data Scientists
- Why So Many Data Scientists Quit Good Jobs at Great Companies
- Explainable Visual Reasoning: How MIT Builds Neural Networks that can Explain Themselves
- How to break a model in 20 days — a tutorial on production model analytics
- MongoDB in the Cloud: Three Solutions for 2021
- Overview of MLOps
- Multilingual CLIP with Huggingface + PyTorch Lightning
- Extraction of Objects In Images and Videos Using 5 Lines of Code
-
Top 10 Python Libraries Data Scientists should know in 2021, by Terence Shin
So many Python libraries exist that offer powerful and efficient foundations for supporting your data science work and machine learning model development. While the list may seem overwhelming, there are certain libraries you should focus your time on, as they are some of the most commonly used today.
- Rejection Sampling with Python
- Metric Matters, Part 2: Evaluating Regression Models
- Top YouTube Machine Learning Channels
- The Best Machine Learning Frameworks & Extensions for Scikit-learn
-
The Portfolio Guide for Data Science Beginners, by Navid Mashinchi
Whether you are an aspiring or seasoned Data Scientist, establishing a clear and well-designed online portfolio presence will help you stand out in the industry, and provide potential employers a powerful understanding of your work and capabilities. These tips will help you brainstorm and launch your first data science portfolio.
- Teaching AI to See Like a Human
- Learning from machine learning mistakes
- How to build a DAG Factory on Airflow
-
More Data Science Cheatsheets, by Matthew Mayo
It's time again to look at some data science cheatsheets. Here you can find a short selection of such resources, catering to different levels of existing knowledge and a breadth of topics.
- How to frame the right questions to be answered using data
- A Simple Way to Time Code in Python
- Automating Machine Learning Model Optimization
- How to Begin Your NLP Journey
- Natural Language Processing Pipelines, Explained
- Metric Matters, Part 1: Evaluating Classification Models
- Data Validation and Data Verification – From Dictionary to Machine Learning
-
10 Amazing Machine Learning Projects of 2020, by Anupam Chugh
So much progress in AI and machine learning happened in 2020, especially in the areas of AI-generating creativity and low-to-no-code frameworks. Check out these trending and popular machine learning projects released last year, and let them inspire your work throughout 2021.
- Forget Telling Stories; Help People Navigate
- Kedro-Airflow: Orchestrating Kedro Pipelines with Airflow
-
Must Know for Data Scientists and Data Analysts: Causal Design Patterns, by Emily Riederer
Industry is a prime setting for observational causal inference, but many companies are blind to causal measurement beyond A/B tests. This formula-free primer illustrates analysis design patterns for measuring causal effects from observational data.
-
Know your data much faster with the new Sweetviz Python library, by Francois Bertrand
One of the latest exploratory data analysis libraries is Sweetviz, a new open-source Python library for finding out data types, missing information, distribution of values, correlations, etc. Find out more about the library and how to use it here.
- A Beginner’s Guide to the CLIP Model
- The Inferential Statistics Data Scientists Should Know
-
A Machine Learning Model Monitoring Checklist: 7 Things to Track, by Dral & Samuylova
Once you deploy a machine learning model in production, you need to make sure it performs. In the article, we suggest how to monitor your models and which open-source tools to use.
- How to Speed Up Pandas with Modin
-
How To Overcome The Fear of Math and Learn Math For Data Science, by Arnuld On Data
Many aspiring Data Scientists, especially when self-learning, fail to learn the necessary math foundations. These recommendations for learning approaches along with references to valuable resources can help you overcome a personal sense of not being "the math type" or belief that you "always failed in math."
- DeepMind’s AlphaFold & the Protein Folding Problem
- Understanding NoSQL Document Databases
- Beautiful decision tree visualizations with dtreeviz
- 11 Essential Code Blocks for Complete EDA (Exploratory Data Analysis)
- Speeding up Scikit-Learn Model Training
- Bayesian Hyperparameter Optimization with tune-sklearn in PyCaret
- Reducing the High Cost of Training NLP Models With SRU++
- Evaluating Object Detection Models Using Mean Average Precision
- 15 common mistakes data scientists make in Python (and how to fix them)
- Getting Started with Distributed Machine Learning with PyTorch and Ray
- Speech to Text with Wav2Vec 2.0
-
3 Mathematical Laws Data Scientists Need To Know, by Cornellius Yudha Wijaya
Machine learning and data science are founded on important mathematics in statistics and probability. A few interesting mathematical laws you should understand will especially help you perform better as a Data Scientist, including Benford's Law, the Law of Large Numbers, and Zipf's Law.
- Google’s Model Search is a New Open Source Framework that Uses Neural Networks to Build Neural Networks
-
Top YouTube Channels for Data Science, by Matthew Mayo
Have a look at the top 15 YouTube channels for data science by number of subscribers, along with some additional data on the channels to help you decide if they may have some content useful for you.