- Essential Math for Data Science: Information Theory
- Jan 15, 2021.
In the context of machine learning, some of the concepts of information theory are used to characterize or compare probability distributions. Read up on the underlying math to gain a solid understanding of relevant aspects of information theory.
Tags: Data Science, Mathematics
- K-Means 8x faster, 27x lower error than Scikit-learn in 25 lines
, by Jakub Adamczyk - Jan 15, 2021.
K-means clustering is a powerful algorithm for similarity searches, and Facebook AI Research's faiss library is turning out to be a speed champion. With only a handful of lines of code shared in this demonstration, faiss outperforms the implementation in scikit-learn in speed and accuracy.
Tags: Algorithms, K-means, Machine Learning, scikit-learn
- Cleaner Data Analysis with Pandas Using Pipes
, by Soner Yıldırım - Jan 15, 2021.
Check out this practical guide on Pandas pipes.
Tags: Data Analysis, Data Cleaning, Pandas, Pipeline, Python
- Data Cleaning and Wrangling in SQL
, by Antonio Badia - Jan 14, 2021.
SQL is a foundational skill for data analysts but its application is sometimes limited within the data pipeline. However, SQL can be successfully used for many pre-processing tasks, such as data cleaning and wrangling, as demonstrated here by example.
Tags: Data Cleaning, Data Preparation, SQL
- Unsupervised Learning for Predictive Maintenance using Auto-Encoders
, by Kundaliya & Aggarwal - Jan 14, 2021.
This article outlines a machine learning approach to detect and diagnose anomalies in the context of machine maintenance, along with a number of introductory concepts, including: Introduction to machine maintenance; What is predictive maintenance?; Approaches for machine diagnosis; Machine diagnosis using machine learning.
Tags: Autoencoder, Predictive Analytics, Predictive Maintenance, Unsupervised Learning
- Creating Good Meaningful Plots: Some Principles
- Jan 12, 2021.
Here are some thought starters to help you create meaningful plots.
Tags: Charts, Data Visualization, Python, R
- Working With Sparse Features In Machine Learning Models
- Jan 12, 2021.
Sparse features can cause problems like overfitting and suboptimal results in learning models, and understanding why this happens is crucial when developing models. Multiple methods, including dimensionality reduction, are available to overcome issues due to sparse features.
Tags: Data Preparation, Feature Engineering, Machine Learning, Overfitting, Sparse data
- Cloud Data Warehouse is The Future of Data Storage
, by Nitin Kumar - Jan 12, 2021.
Today, cloud data storage accounts for 45% of all enterprise data, and by Q2 2021 that number could grow to 53%. There is no better time to embrace the cloud than now.
Tags: Cloud, Data Warehouse, Data Warehousing
- Attention mechanism in Deep Learning, Explained
, by Nagesh Chauhan - Jan 11, 2021.
Attention is a powerful mechanism developed to enhance the performance of the Encoder-Decoder architecture on neural network-based machine translation tasks. Learn more about how this process works and how to implement the approach into your work.
Tags: Attention, Deep Learning, Explained, LSTM, Machine Translation
- OpenAI Releases Two Transformer Models that Magically Link Language and Computer Vision
, by Jesus Rodriguez - Jan 11, 2021.
OpenAI has released two new transformer architectures that combine image and language tasks in a fun and almost magical way. Read more about them here.
Tags: Computer Vision, NLP, OpenAI, Transformer
- JupyterLab 3 is Here: Key reasons to upgrade now
, by Matthew Mayo - Jan 8, 2021.
Read about these 3 reasons for checking out JupyterLab 3 today.
Tags: Data Science, IDE, Jupyter, Programming
- Best Python IDEs and Code Editors You Should Know
, by Claire D. Costa - Jan 8, 2021.
Developing machine learning algorithms requires implementing countless libraries and integrating many supporting tools and software packages. All this magic must be written by you in yet another tool -- the IDE -- that is fundamental to all your code work and can drive your productivity. These top Python IDEs and code editors are among the best tools available for you to consider, and are reviewed with their noteworthy features.
Tags: IDE, Jupyter, PyCharm, Python, Visual Studio Code
- Top 10 Computer Vision Papers 2020
, by Louis (What’s AI) Bouchard - Jan 8, 2021.
The top 10 computer vision papers in 2020 with video demos, articles, code, and paper reference.
Tags: AI, Computer Vision, Research
- Advice to aspiring Data Scientists – your most common questions answered
, by Roman Orac - Jan 7, 2021.
Embarking on a new career path can be daunting with many unknowns about how to get started and how to be successful. If you are aspiring to become a Data Scientist, then the answers to these common questions can help set you off on the right foot.
Tags: Advice, Career Advice, Data Scientist, Mathematics, Online Education, SQL
- 10 Underappreciated Python Packages for Machine Learning Practitioners
, by Vinay Uday Prabhu - Jan 7, 2021.
Here are 10 underappreciated Python packages covering neural architecture design, calibration, UI creation and dissemination.
Tags: Deployment, Neural Networks, Python, UI/UX
- CatalyzeX: A must-have browser extension for machine learning engineers and researchers
, by Himanshu Ragtah - Jan 6, 2021.
CatalyzeX is a free browser extension that finds code implementations for ML/AI papers anywhere on the internet (Google, Arxiv, Twitter, Scholar, and other sites).
Tags: Implementation, Machine Learning, Programming, Research
- Learn Data Science for free in 2021
, by Ahmad Anis - Jan 6, 2021.
If you are considering starting a career path in machine learning and data science, then there is a great deal to learn theoretically, along with gaining practical skills in applying a broad range of techniques. This comprehensive learning plan will guide you to start on this path, and it is all available for free.
Tags: Data Science Education, Online Education
- MLOps: Model Monitoring 101
, by Saha & Bose - Jan 6, 2021.
Model monitoring using a model metric stack is essential for establishing a feedback loop from a deployed ML model back to the model-building stage, so that ML models can constantly improve themselves under different scenarios.
Tags: AI, Data Science, DevOps, Machine Learning, MLOps, Modeling
- Model Experiments, Tracking and Registration using MLflow on Databricks
, by Dash Desai - Jan 5, 2021.
This post covers how StreamSets can help expedite operations at some of the most crucial stages of the machine learning lifecycle and MLOps, and demonstrates integration with Databricks and MLflow.
Tags: Data Science, Databricks, DataOps, Experimentation, MLflow, MLOps, Modeling
- DeepMind’s MuZero is One of the Most Important Deep Learning Systems Ever Created
, by Jesus Rodriguez - Jan 4, 2021.
MuZero takes a unique approach to solving the problem of planning in deep learning models.
Tags: AlphaZero, Deep Learning, DeepMind, MuZero, Reinforcement Learning
- All Machine Learning Algorithms You Should Know in 2021
, by Terence Shin - Jan 4, 2021.
Many machine learning algorithms exist that range from simple to complex in their approach, and together provide a powerful library of tools for analyzing and predicting patterns from data. If you are learning for the first time or reviewing techniques, then these intuitive explanations of the most popular machine learning models will help you kick off the new year with confidence.
Tags: Algorithms, Decision Tree, Explained, Gradient Boosting, K-nearest neighbors, Machine Learning, Naive Bayes, Regression, SVM