- Snowflake and Saturn Cloud Partner To Bring 100x Faster Data Science to Millions of Python Users
- Jan 15, 2021.
Snowflake, the cloud data platform, is partnering, integrating products, and pursuing a joint go-to-market with Saturn Cloud to help data science teams get 100x faster results. Read more about developments and how to get started here.
Tags: Data Science, Python, Saturn Cloud, Snowflake
- Essential Math for Data Science: Information Theory
- Jan 15, 2021.
In the context of machine learning, some of the concepts of information theory are used to characterize or compare probability distributions. Read up on the underlying math to gain a solid understanding of relevant aspects of information theory.
Tags: Data Science, Mathematics
- K-Means 8x faster, 27x lower error than Scikit-learn in 25 lines
, by Jakub Adamczyk - Jan 15, 2021.
K-means clustering is a powerful algorithm for similarity searches, and Facebook AI Research's faiss library is turning out to be a speed champion. With only a handful of lines of code shared in this demonstration, faiss outperforms the implementation in scikit-learn in speed and accuracy.
Tags: Algorithms, K-means, Machine Learning, scikit-learn
- Cleaner Data Analysis with Pandas Using Pipes
, by Soner Yıldırım - Jan 15, 2021.
Check out this practical guide on Pandas pipes.
Tags: Data Analysis, Data Cleaning, Pandas, Pipeline, Python
- 8 New Tools I Learned as a Data Scientist in 2020
- Jan 14, 2021.
The author shares the data science tools learned while making the move from Docker to Live Deployments.
Tags: Data Science, Data Science Tools, Data Scientist, Deployment, Docker, Kubernetes, MLflow, NoSQL
- Data Cleaning and Wrangling in SQL
, by Antonio Badia - Jan 14, 2021.
SQL is a foundational skill for data analysts but its application is sometimes limited within the data pipeline. However, SQL can be successfully used for many pre-processing tasks, such as data cleaning and wrangling, as demonstrated here by example.
Tags: Data Cleaning, Data Preparation, SQL
- Unsupervised Learning for Predictive Maintenance using Auto-Encoders
, by Kundaliya & Aggarwal - Jan 14, 2021.
This article outlines a machine learning approach to detect and diagnose anomalies in the context of machine maintenance, along with a number of introductory concepts, including an introduction to machine maintenance, what predictive maintenance is, approaches for machine diagnosis, and machine diagnosis using machine learning.
Tags: Autoencoder, Predictive Analytics, Predictive Maintenance, Unsupervised Learning
- My Data Science Learning Journey So Far
- Jan 13, 2021.
These are some obstacles the author faced in their data science learning journey in the past year, including how much time it took to overcome each obstacle and what it has taught the author.
Tags: Career Advice, Challenges, Data Science, Data Science Education
- The Four Jobs of the Data Scientist
, by Roger Peng - Jan 13, 2021.
So, what do you do for a living? Sometimes, the answer to that question can feel like, "everything!" Well, for the Data Scientist, an extreme sense of being a "jack of all trades" is common. In fact, four such trades can be defined that a top-quality Data Scientist will iterate through during any one project.
Tags: Data Science, Data Science Skills, Data Scientist, Statistician
- The Best Tool for Data Blending is KNIME
, by Dennis Ganzaroli - Jan 13, 2021.
These are the lessons and best practices I learned in many years of experience in data blending, and the software that became my most important tool in my day-to-day work.
Tags: Data Exploration, Data Management, ETL, Knime
- Creating Good Meaningful Plots: Some Principles
- Jan 12, 2021.
Here are some thought starters to help you create meaningful plots.
Tags: Charts, Data Visualization, Python, R
- Working With Sparse Features In Machine Learning Models
- Jan 12, 2021.
Sparse features can cause problems like overfitting and suboptimal results in learning models, and understanding why this happens is crucial when developing models. Multiple methods, including dimensionality reduction, are available to overcome issues due to sparse features.
Tags: Data Preparation, Feature Engineering, Machine Learning, Overfitting, Sparse data
- Cloud Data Warehouse is The Future of Data Storage
, by Nitin Kumar - Jan 12, 2021.
Today, cloud data storage accounts for 45% of all enterprise data, and by Q2 2021 that number could grow to 53%. There is no better time to embrace the cloud than now.
Tags: Cloud, Data Warehouse, Data Warehousing
- Top Stories, Jan 04-10: Best Python IDEs and Code Editors You Should Know; All Machine Learning Algorithms You Should Know in 2021
- Jan 11, 2021.
Also: DeepMind’s MuZero is One of the Most Important Deep Learning Systems Ever Created; 10 Underappreciated Python Packages for Machine Learning Practitioners; Six Tips on Building a Data Science Team at a Small Company
Tags: Top stories
- 5 Tools for Effortless Data Science
, by Nicole Janeway Bills - Jan 11, 2021.
The sixth tool is coffee.
Tags: Data Science, Data Science Tools, Keras, Machine Learning, MLflow, PyCaret, Python
- Attention mechanism in Deep Learning, Explained
, by Nagesh Chauhan - Jan 11, 2021.
Attention is a powerful mechanism developed to enhance the performance of the Encoder-Decoder architecture on neural network-based machine translation tasks. Learn more about how this process works and how to implement the approach into your work.
Tags: Attention, Deep Learning, Explained, LSTM, Machine Translation
- OpenAI Releases Two Transformer Models that Magically Link Language and Computer Vision
, by Jesus Rodriguez - Jan 11, 2021.
OpenAI has released two new transformer architectures that combine image and language tasks in a fun and almost magical way. Read more about them here.
Tags: Computer Vision, NLP, OpenAI, Transformer
- JupyterLab 3 is Here: Key reasons to upgrade now
, by Matthew Mayo - Jan 8, 2021.
Read about these 3 reasons for checking out JupyterLab 3 today.
Tags: Data Science, IDE, Jupyter, Programming
- Best Python IDEs and Code Editors You Should Know
, by Claire D. Costa - Jan 8, 2021.
Developing machine learning algorithms requires implementing countless libraries and integrating many supporting tools and software packages. All this magic must be written by you in yet another tool -- the IDE -- that is fundamental to all your code work and can drive your productivity. These top Python IDEs and code editors are among the best tools available for you to consider, and are reviewed with their noteworthy features.
Tags: IDE, Jupyter, PyCharm, Python, Visual Studio Code
- Top 10 Computer Vision Papers 2020
, by Louis (What’s AI) Bouchard - Jan 8, 2021.
The top 10 computer vision papers in 2020 with video demos, articles, code, and paper references.
Tags: AI, Computer Vision, Research
- Top December Stories: Why the Future of ETL Is Not ELT, But EL(T); 20 Core Data Science Concepts for Beginners
- Jan 7, 2021.
Also: A Rising Library Beating Pandas in Performance; 15 Free Data Science, Machine Learning & Statistics eBooks for 2021
Tags: Top stories
- 11 Industrial AI Trends that will Dominate the World in 2021
, by Swati Giri - Jan 7, 2021.
These trends broadly cover the three themes of: Where will businesses adopt AI in 2021? How will AI become more accessible? How will AI capabilities evolve?
Tags: 2021 Predictions, AI, Trends
- Advice to aspiring Data Scientists – your most common questions answered
, by Roman Orac - Jan 7, 2021.
Embarking on a new career path can be daunting with many unknowns about how to get started and how to be successful. If you are aspiring to become a Data Scientist, then the answers to these common questions can help set you off on the right foot.
Tags: Advice, Career Advice, Data Scientist, Mathematics, Online Education, SQL
- 10 Underappreciated Python Packages for Machine Learning Practitioners
, by Vinay Uday Prabhu - Jan 7, 2021.
Here are 10 underappreciated Python packages covering neural architecture design, calibration, UI creation and dissemination.
Tags: Deployment, Neural Networks, Python, UI/UX
- CatalyzeX: A must-have browser extension for machine learning engineers and researchers
, by Himanshu Ragtah - Jan 6, 2021.
CatalyzeX is a free browser extension that finds code implementations for ML/AI papers anywhere on the internet (Google, Arxiv, Twitter, Scholar, and other sites).
Tags: Implementation, Machine Learning, Programming, Research
- Learn Data Science for free in 2021
, by Ahmad Anis - Jan 6, 2021.
If you are considering starting a career path in machine learning and data science, then there is a great deal to learn theoretically, along with gaining practical skills in applying a broad range of techniques. This comprehensive learning plan will guide you to start on this path, and it is all available for free.
Tags: Data Science Education, Online Education
- MLOps: Model Monitoring 101
, by Saha & Bose - Jan 6, 2021.
Model monitoring using a model metric stack is essential to put a feedback loop from a deployed ML model back to the model building stage so that ML models can constantly improve themselves under different scenarios.
Tags: AI, Data Science, DevOps, Machine Learning, MLOps, Modeling
- Where is Marketing Data Science Headed?
, by Kevin Gray - Jan 5, 2021.
Marketing data science - data science related to marketing - is now a significant part of marketing. Some of it directly competes with traditional marketing research and many marketing researchers may wonder what the future holds in store for it.
Tags: Analytics, Data Science, Marketing
- How to Get a Job as a Data Engineer
, by Anna Anisienia - Jan 5, 2021.
Data engineering skills are currently in high demand. If you are looking for career prospects in this fast-growing profession, then these 10 skills and key factors will help you prepare to land an entry-level position in this field.
Tags: Career Advice, Data Engineer, Data Engineering
- Model Experiments, Tracking and Registration using MLflow on Databricks
, by Dash Desai - Jan 5, 2021.
This post covers how StreamSets can help expedite operations at some of the most crucial stages of Machine Learning Lifecycle and MLOps, and demonstrates integration with Databricks and MLflow.
Tags: Data Science, Databricks, DataOps, Experimentation, MLflow, MLOps, Modeling
- DeepMind’s MuZero is One of the Most Important Deep Learning Systems Ever Created
, by Jesus Rodriguez - Jan 4, 2021.
MuZero takes a unique approach to solve the problem of planning in deep learning models.
Tags: AlphaZero, Deep Learning, DeepMind, MuZero, Reinforcement Learning
- Top Stories, Dec 21 – Jan 03: Monte Carlo integration in Python; 15 Free Data Science, Machine Learning & Statistics eBooks for 2021
- Jan 4, 2021.
Also: SQL vs NoSQL: 7 Key Takeaways; Generating Beautiful Neural Network Visualizations; Meet whale! The stupidly simple data discovery tool; Key Data Science Algorithms Explained: From k-means to k-medoids clustering
Tags: Top stories
- All Machine Learning Algorithms You Should Know in 2021
, by Terence Shin - Jan 4, 2021.
Many machine learning algorithms exist that range from simple to complex in their approach, and together provide a powerful library of tools for analyzing and predicting patterns from data. If you are learning for the first time or reviewing techniques, then these intuitive explanations of the most popular machine learning models will help you kick off the new year with confidence.
Tags: Algorithms, Decision Tree, Explained, Gradient Boosting, K-nearest neighbors, Machine Learning, Naive Bayes, Regression, SVM
- Six Tips on Building a Data Science Team at a Small Company
, by Zbar & Vallejo - Jan 4, 2021.
When a company decides that they want to start leveraging their data for the first time, it can be a daunting task. Many businesses aren’t fully aware of all that goes into building a data science department. If you're the data scientist hired to make this happen, we have some tips to help you face the task head-on.
Tags: Data Science, Data Science Team, Data Scientist