Blog / News
- How Reading Papers Helps You Be a More Effective Data Scientist [Tuto] - Feb 24, 2021.
By reading papers, we were able to learn what others (e.g., LinkedIn) have found to work (and not work). We can then adapt their approach and not have to reinvent the rocket. This helps us deliver a working solution with less time and effort.
- Why Do Machine Learning Projects Fail?, by Rahul Agarwal [Opin] - Feb 24, 2021.
At the beginning of any data science project, many challenges could arise that lead to its eventual collapse. Making sure you look ahead -- early in the planning -- toward putting your resulting model into production can help increase the chance of delivering long-term value with your developed machine learning system.
- Pandas Profiling: One-Line Magical Code for EDA, by Juhi Sharma [Tuto] - Feb 24, 2021.
EDA can be automated using a Python library called Pandas Profiling. Let’s explore Pandas Profiling to do EDA in a very short time and with just a single line of code.
- KDnuggets™ News 21:n08, Feb 24: Powerful Exploratory Data Analysis in just two lines of code; Cartoon: Data Scientist vs Data Engineer - Feb 24, 2021.
Powerful Exploratory Data Analysis in just two lines of code; Cartoon: Data Scientist vs Data Engineer; Evaluating Deep Learning Models: The Confusion Matrix, Accuracy, Precision, and Recall; Feature Store as a Foundation for Machine Learning; Approaching (Almost) Any Machine Learning Problem
- Using NLP to improve your Resume, by David Moore [Tuto] - Feb 23, 2021.
This article discusses performing keyword matching and text analysis on job descriptions.
- 10 Statistical Concepts You Should Know For Data Science Interviews, by Terence Shin [Tuto] - Feb 23, 2021.
Data Science is founded on time-honored concepts from statistics and probability theory. Having a strong understanding of the ten ideas and techniques highlighted here is key to your career in the field, and also a favorite topic for concept checks during interviews.
- Data Observability, Part II: How to Build Your Own Data Quality Monitors Using SQL, by Barr Moses [Tuto] - Feb 23, 2021.
Using schema and lineage to understand the root cause of your data anomalies.
- Top Stories, Feb 15-21: We Don’t Need Data Scientists, We Need Data Engineers [Top ] - Feb 22, 2021.
Also: Telling a Great Data Story: A Visualization Decision Tree; Cartoon: Data Scientist vs Data Engineer; Data Science vs Business Intelligence, Explained; Approaching (Almost) Any Machine Learning Problem
- An overview of synthetic data types and generation methods, by Devaux & Wehmeyer [Tuto] - Feb 22, 2021.
Synthetic data can be used to test new products and services, validate models, or test performance because it mimics the statistical properties of production data. Today you'll find different types of structured and unstructured synthetic data.
- Powerful Exploratory Data Analysis in just two lines of code, by Francois Bertrand [Tuto] - Feb 22, 2021.
EDA is a fundamental early process for any Data Science investigation. Typical approaches for visualization and exploration are powerful, but can be cumbersome for getting to the heart of your data. Now, you can get to know your data much faster with only a few lines of code... and it might even be fun!
- Inside the Architecture Powering Data Quality Management at Uber, by Jesus Rodriguez [Tuto] - Feb 22, 2021.
Data Quality Monitor implements novel statistical methods for anomaly detection and quality management in large data infrastructures.
- Cartoon: Data Scientist vs Data Engineer, by Gregory Piatetsky [Opin] - Feb 20, 2021.
New KDnuggets Cartoon examines the problems of Data Scientists vs Data Engineers.
- People Skills for Analytical Thinkers, by Gilbert Eijkelenboom [Prod] - Feb 19, 2021.
Research shows that people skills are becoming more important with the rise of AI. A great way to boost these skills is by reading the new book: People Skills for Analytical Thinkers.
- Evaluating Deep Learning Models: The Confusion Matrix, Accuracy, Precision, and Recall, by Ahmed Gad [Tuto] - Feb 19, 2021.
This tutorial discusses the confusion matrix, how precision, recall, and accuracy are calculated, and how they relate to evaluating deep learning models.
- Feature Store as a Foundation for Machine Learning, by German Osin [Tuto] - Feb 19, 2021.
With so many organizations now taking the leap into building production-level machine learning models, many lessons learned are coming to light about the supporting infrastructure. For a variety of important types of use cases, maintaining a centralized feature store is essential for higher ROI and faster delivery to market. In this review, the current feature store landscape is described, and you can learn how to architect one into your MLOps pipeline.
- Multidimensional multi-sensor time-series data analysis framework, by Ajay Arunachalam [Tuto] - Feb 19, 2021.
This blog post provides an overview of the package “msda” useful for time-series sensor data analysis. A quick introduction about time-series data is also provided.
- Approaching (Almost) Any Machine Learning Problem, by Matthew Mayo [Tuto] - Feb 18, 2021.
This freely-available book is a fantastic walkthrough of practical approaches to machine learning problems.
- 6 Data Science Certificates To Level Up Your Career, by Sara Metwalli [Tuto] - Feb 18, 2021.
Anyone looking to obtain a data science certificate to prove their ability in the field will find a range of options exist. We review several valuable certificates to consider that will definitely pump up your resume and portfolio to get you closer to your dream job.
- Forecasting Stories 5: The story of the launch, by Rajneet Kaur [Tuto] - Feb 18, 2021.
New product forecasting can be very difficult: there is no history to start with, and hence no baseline. The number of assumptions can be huge. The best way to forecast, then, is to try parallel approaches, build different views, and triangulate on a common range.
- Distributed and Scalable Machine Learning [Webinar], by Coiled.io [Prod] - Feb 17, 2021.
Mike McCarty and Gil Forsyth work at the Capital One Center for Machine Learning, where they are building internal PyData libraries that scale with Dask and RAPIDS. For this webinar, Feb 23 @ 2 pm PST, 5pm EST, they’ll join Hugo Bowne-Anderson and Matthew Rocklin to discuss their journey to scale data science and machine learning in Python.
- GPT-2 vs GPT-3: The OpenAI Showdown, by Kevin Vu [Tuto] - Feb 17, 2021.
Thanks to the diversity of the dataset used in the training process, we can obtain adequate text generation across a variety of domains. GPT-2 has 10x the parameters and 10x the training data of its predecessor, GPT.
- 10 resources for data science self-study, by Benjamin Obi Tayo [Tuto] - Feb 17, 2021.
Many resources exist for the self-study of data science. In our modern age of information technology, an enormous amount of free learning resources are available to anyone, and with effort and dedication, you can master the fundamentals of data science.
- Deep Learning-based Real-time Video Processing, by Serhii Maksymenko [Tuto] - Feb 17, 2021.
In this article, we explore how to build a pipeline and process real-time video with Deep Learning to apply this approach to business use cases overviewed in our research.
- KDnuggets™ News 21:n07, Feb 17: We Don’t Need Data Scientists, We Need Data Engineers; Data Science vs Business Intelligence, Explained - Feb 17, 2021.
Do we need more data engineers than data scientists? Data Science vs Business Intelligence, Explained; Telling a Great Data Story: A Visualization Decision Tree; Essential Math for Data Science: Scalars and Vectors; 7 most recommended skills for a Data Scientist.
- Machine Learning for Cybersecurity Certificate at U. of Chicago, by U. of Chicago [Prod] - Feb 16, 2021.
Hands-On Machine Learning Training from UChicago: 5-week remote Machine Learning for Cybersecurity certificate, Mar 30 - Apr 27. Learn from & network with leading faculty/industry leaders, learn data-driven prevention strategies. Group discounts, tuition support.
- Data Observability: Building Data Quality Monitors Using SQL, by Kearns & Moses [Tuto] - Feb 16, 2021.
To trigger an alert when data breaks, data teams can leverage a tried and true tactic from our friends in software engineering: monitoring and observability. In this article, we walk through how you can create your own data quality monitors for freshness and distribution from scratch using SQL.
- Hugging Face Transformers Package – What Is It and How To Use It, by Nagesh Chauhan [Tuto] - Feb 16, 2021.
The rapid development of Transformers has brought a new wave of powerful tools to natural language processing. These models are large and very expensive to train, so pre-trained versions are shared and leveraged by researchers and practitioners. Hugging Face offers a wide variety of pre-trained transformers as open-source libraries, and you can incorporate these with only one line of code.
- Easy, Open-Source AutoML in Python with EvalML, by Dylan Sherry [Tuto] - Feb 16, 2021.
We’re excited to announce that a new open-source project has joined the Alteryx open-source ecosystem. EvalML is a library for automated machine learning (AutoML) and model understanding, written in Python.
- IBM Uses Continual Learning to Avoid The Amnesia Problem in Neural Networks, by Jesus Rodriguez [Tuto] - Feb 15, 2021.
Using continual learning might avoid the famous catastrophic forgetting problem in neural networks.
- We Don’t Need Data Scientists, We Need Data Engineers, by Mihail Eric [Opin] - Feb 15, 2021.
As more people are entering the field of Data Science and more companies are hiring for data-centric roles, what type of jobs are currently in highest demand? With so much data in the world, and more flooding in all the time, it now looks like companies are targeting those who can engineer that data more than those who can only model the data.
- Top Stories, Feb 08-14: How to create stunning visualizations using python from scratch; Data Science vs Business Intelligence, Explained [Top ] - Feb 15, 2021.
Also: The Best Data Science Project to Have in Your Portfolio; How to Get Your First Job in Data Science without Any Work Experience; How to Get Data Science Interviews: Finding Jobs, Reaching Gatekeepers, and Getting Referrals
- Telling a Great Data Story: A Visualization Decision Tree, by Stan Pugsley [Tuto] - Feb 15, 2021.
Pick your visualizations strategically. They need to tell a story.
- Essential Math for Data Science: Scalars and Vectors, by Hadrien Jean [Tuto] - Feb 12, 2021.
Linear algebra is the branch of mathematics that studies vector spaces. You’ll see how vectors constitute vector spaces and how linear algebra applies linear transformations to these spaces. You’ll also learn the powerful relationship between sets of linear equations and vector equations.
- 6 NLP Techniques Every Data Scientist Should Know, by Sara Metwalli [Tuto] - Feb 12, 2021.
Natural language processing has already begun to transform the way humans interact with computers, and its advances are moving rapidly. The field is built on core methods that must first be understood, with which you can then launch your data science projects to a new level of sophistication and value.
- Understanding NoSQL Database Types: Column-Oriented Databases, by Alex Williams [Tuto] - Feb 12, 2021.
NoSQL databases come in four distinct types: key-value stores, document stores, graph databases, and column-oriented databases. In this article, we’ll explore column-oriented databases, also known simply as “NoSQL columns”.
- Online MS in Data Science from Northwestern, by Northwestern [Prod] - Feb 11, 2021.
Advance your data science career with Northwestern. Build the essential technical, analytical, and leadership skills needed for careers in today's data-driven world in Northwestern's Master of Science in Data Science program. Apply now.
- How to Speed up Scikit-Learn Model Training, by Michael Galarnyk [Tuto] - Feb 11, 2021.
Scikit-Learn is an easy-to-use Python library for machine learning. However, sometimes scikit-learn models can take a long time to train. The question becomes, how do you create the best scikit-learn model in the least amount of time?
- Machine Learning – it’s all about assumptions, by Vishal Mendekar [Opin] - Feb 11, 2021.
As with most things in life, assumptions can directly lead to success or failure. In machine learning, too, appreciating the assumed logic behind each technique will guide you toward applying the best tool for the data.
- A Critical Comparison of Machine Learning Platforms in an Evolving Market, by Vivek Jain [Tuto] - Feb 11, 2021.
There’s a clear inclination towards the MLaaS model across industries, given the fact that companies today have an option to select from a wide range of solutions that can cater to diverse business needs. Here is a look at 3 of the top ML platforms for data excellence.
- Top January Stories: How I Got 4 Data Science Offers and Doubled My Income 2 Months After Being Laid Off; Best Python IDEs and Code Editors You Should Know, by Gregory Piatetsky [Top ] - Feb 10, 2021.
Also: All Machine Learning Algorithms You Should Know in 2021; DeepMind's MuZero is One of the Most Important Deep Learning Systems Ever Created
- Explore Molecular Engineering at UChicago, by U. of Chicago [Prod] - Feb 10, 2021.
Today’s engineers need to be equipped with the tools to take on leadership positions across industries. The new master’s program at the University of Chicago’s Pritzker School of Molecular Engineering will provide you with a streamlined and flexible degree to give you broad exposure across science and engineering disciplines, while preparing you for the immediate next step in your professional journey.
- My machine learning model does not learn. What should I do?, by Silipo & Arenas [Tuto] - Feb 10, 2021.
This article presents 7 hints on how to get out of the quicksand.
- 7 Most Recommended Skills to Learn to be a Data Scientist, by Terence Shin [Tuto] - Feb 10, 2021.
The Data Scientist profession has emerged as a truly interdisciplinary role that spans a variety of skills, theoretical and practical. For the core, day-to-day activities, many critical requirements that enable the delivery of real business value reach well outside the realm of machine learning, and should be mastered by those aspiring to the field.
- Data Science vs Business Intelligence, Explained, by Stan Pugsley [Tuto] - Feb 10, 2021.
Knowing the differences between business intelligence and data science is more than just a matter of semantics.
- KDnuggets™ News 21:n06, Feb 10: The Best Data Science Project to Have in Your Portfolio; Deep learning doesn’t need to be a black box - Feb 10, 2021.
The Best Data Science Project to Have in Your Portfolio; Deep learning doesn’t need to be a black box; Build Your First Data Science Application; How to create stunning visualizations using python from scratch; How to Get Your First Job in Data Science without Any Work Experience
- A Solid Investment: Banking on Talent Development, by SAS [Prod] - Feb 9, 2021.
The demand for analytics skills and talent has never been higher. As the workforce continues to evolve, so do the technology and skillsets required. Millennium Bank partnered with SAS on a tailored training and development program that improved skills and knowledge while strengthening retention.
- How to Deploy a Flask API in Kubernetes and Connect it with Other Micro-services, by Rik Kraan [Tuto] - Feb 9, 2021.
A hands-on tutorial on how to implement your micro-service architecture using the powerful container orchestration tool Kubernetes.
- Who is fit to lead data science?, by Polly Mitchell-Guthrie [Opin] - Feb 9, 2021.
Data science success depends on leaders, not the latest hands-on programming skills. So, we need to start looking for the right leadership skills and stop stuffing job postings with requirements for experience in the most current development tools.
- Adversarial Attacks on Explainable AI, by Hubert Baniecki [Tuto] - Feb 9, 2021.
Are explainability methods black-box themselves?
- Microsoft Explores Three Key Mysteries of Ensemble Learning, by Jesus Rodriguez [Tuto] - Feb 8, 2021.
A new paper studies three key puzzling characteristics of deep learning ensembles and some potential explanations.
- How to Get Data Science Interviews: Finding Jobs, Reaching Gatekeepers, and Getting Referrals, by Emma Ding [Opin] - Feb 8, 2021.
In this post, the author shares what to do to get job interviews efficiently. Find answers to these questions: Where should I look for data science jobs? How do I reach out to the gatekeeper? How do I get referrals? What makes a good data science resume?
- The Best Data Science Project to Have in Your Portfolio, by Soner Yıldırım [Opin] - Feb 8, 2021.
If you are trying to find your first path into a Data Science career, then demonstrating the quality of your skills can be the greatest hurdle. While many standard projects exist for anyone to complete, creating an original data-driven project that attempts to solve some challenge is worth so much more. A good Data Scientist is one who can solve data-related questions, and a great Data Scientist poses original data-related questions and then solves them.
- Top Stories, Feb 1-7: How to create stunning visualizations using python from scratch; How to Get Your First Job in Data Science without Any Work Experience [Top ] - Feb 8, 2021.
Also: Build Your First Data Science Application; 3 Ways Understanding Bayes Theorem Will Improve Your Data Science; Deep learning doesn’t need to be a black box; Essential Math for Data Science: Introduction to Matrices and the Matrix Product
- Essential Math for Data Science: Introduction to Matrices and the Matrix Product, by Hadrien Jean [Tuto] - Feb 5, 2021.
Like vectors, matrices are data structures allowing you to organize numbers. They are square or rectangular arrays containing values organized in two dimensions: as rows and columns. You can think of them as a spreadsheet. Learn more here.
- Deep learning doesn’t need to be a black box, by Ben Dickson [Tuto] - Feb 5, 2021.
The cultural perception of AI is often suspect because of the described challenges in knowing why a deep neural network makes its predictions. So, researchers try to crack open this "black box" after a network is trained to correlate results with inputs. But, what if the goal of explainability could be designed into the network's architecture -- before the model is trained and without reducing its predictive power? Maybe the box could stay open from the beginning.
- Backcasting: Building an Accurate Forecasting Model for Your Business, by Lena Boichuk [Tuto] - Feb 5, 2021.
This article sheds some light on the processes happening under the hood of ML-based solutions, using the example of a business case where future success directly depends on the ability to predict unknown values from the past.
- Build Your First Data Science Application, by Naser Tamimi [Tuto] - Feb 4, 2021.
Check out these seven Python libraries to make your first data science MVP application.
- How to create stunning visualizations using python from scratch, by Sharan Kumar R [Tuto] - Feb 4, 2021.
Data science and data analytics can be beautiful things. Not only because of the insights and enhancements to decision-making they can provide, but because of the rich visualizations about the data that can be created. Following this step-by-step guide using the Matplotlib and Seaborn libraries will help you improve the presentation and effective communication of your work.
- 2011: DanNet triggers deep CNN revolution, by Jürgen Schmidhuber [Opin] - Feb 4, 2021.
In 2021, we are celebrating the 10-year anniversary of DanNet, which, in 2011, was the first pure deep convolutional neural network (CNN) to win computer vision contests. Read about its history here.
- Getting Started with 5 Essential Natural Language Processing Libraries, by Matthew Mayo [Tuto] - Feb 3, 2021.
This article is an overview of how to get started with 5 popular Python NLP libraries, from those for linguistic data visualization, to data preprocessing, to multi-task functionality, to state of the art language modeling, and beyond.
- Saving and loading models in TensorFlow — why it is important and how to do it, by Ahmad Anis [Tuto] - Feb 3, 2021.
So much time and effort can go into training your machine learning models. But, shut down the notebook or system, and all those trained weights and more vanish with the memory flush. Saving your models to maximize reusability is key for efficient productivity.
- How to Get Your First Job in Data Science without Any Work Experience, by Madison Hunter [Opin] - Feb 3, 2021.
Creativity, grit, and perseverance will become the three words you live by.
- KDnuggets™ News 21:n05, Feb 3: How to Get a Job as a Data Scientist; Popular Machine Learning Interview Questions, part 2 - Feb 3, 2021.
Learn how to get a job as a Data Scientist; it will help if you study popular machine learning interview questions; Beyond the Nash Equilibrium: DeepMind Clever Strategy to Solve Asymmetric Games; Understanding Bayes Theorem; and more.
- Adversarial generation of extreme samples, by Lucy Smith [Tuto] - Feb 2, 2021.
In order to mitigate risks when modelling extreme events, it is vital to be able to generate a wide range of extreme, and realistic, scenarios. Researchers from the National University of Singapore and IIT Bombay have developed an approach to do just that.
- Does Data Science Make You Happy?, by Yulia Lukashina [Opin] - Feb 2, 2021.
Maybe you are embarking on a new learning journey into the world of data and its analysis, or you already launched your career in the field. But, how can you make sure that data science is your calling? Indeed, if you feel good in your job, then you are likely on the right path.
- Vision Transformers: Natural Language Processing (NLP) Increases Efficiency and Model Generality, by Kevin Vu [Tuto] - Feb 2, 2021.
Why do we hear so little about transformer models applied to computer vision tasks? What about attention in computer vision networks?
- Top Stories, Jan 25-31: Want to Be a Data Scientist? Don’t Start With Machine Learning; The Ultimate Scikit-Learn Machine Learning Cheatsheet [Top ] - Feb 1, 2021.
Also: How I Got 4 Data Science Offers and Doubled my Income 2 Months After Being Laid Off; How to Get a Job as a Data Scientist; Data Engineering — the Cousin of Data Science, is Troublesome; What to Learn to Become a Data Scientist in 2021
- Celebrate International Women’s Day at the Women in Data Science (WiDS) Worldwide Virtual Conference, by Women in Data Science [Prod] - Feb 1, 2021.
On March 8, 2021, Stanford will host the inaugural 24-hour virtual Women in Data Science (WiDS) Worldwide conference. Find out speaker and registration information here.
- 3 Ways Understanding Bayes Theorem Will Improve Your Data Science, by Nicole Janeway Bills [Tuto] - Feb 1, 2021.
Mastery of the mathematics and applications of this intuitive statistical concept will advance your credibility as a decision maker.
- One question to make your data project 10x more valuable, by Brittany Davis [Opin] - Feb 1, 2021.
If you are the "data person" for your organization, then providing meaningful results to stakeholder data requests can sometimes feel like shots in the dark. However, you can make sure your data analysis is actionable by asking one magic question before getting started.
- Beyond the Nash Equilibrium: DeepMind Clever Strategy to Solve Asymmetric Games, by Jesus Rodriguez [Tuto] - Feb 1, 2021.
The method expands the concept of a Nash equilibrium by decomposing an asymmetric game into multiple symmetric games.
- Baidu Research: 10 Technology Trends in 2021, by Baidu Research [Tuto] - Jan 29, 2021.
Understanding future technology trends may never have been as important as it is today. Check out the prediction of the 10 technology trends in 2021 from Baidu Research.
- Machine learning adversarial attacks are a ticking time bomb, by Ben Dickson [Opin] - Jan 29, 2021.
Software developers and cyber security experts have long fought the good fight against vulnerabilities in code to defend against hackers. A new, subtle approach to maliciously targeting machine learning models has been a recent hot topic in research, but its statistical nature makes it difficult to find and patch these so-called adversarial attacks. Such threats in the real world are becoming imminent as the adoption of machine learning spreads, and a systematic defense must be implemented.
- What is Graph Theory, and Why Should You Care?, by Vegard Flovik [Tuto] - Jan 29, 2021.
Go from graph theory to path optimization.
- Top 5 Reasons Why Machine Learning Projects Fail, by Sudeep Srivastava [Tuto] - Jan 28, 2021.
The rise in machine learning project implementation is coming, as is the number of failures, due to several implementation and maintenance challenges. The first step in closing this gap lies in understanding the reasons for the failure.
- Machine learning is going real-time, by Chip Huyen [Opin] - Jan 28, 2021.
Extracting immediate predictions from machine learning algorithms on the spot based on brand-new data can offer a next level of interaction and potential value to its consumers. The infrastructure and tech stack required to implement such real-time systems is also next level, and many organizations -- especially in the US -- seem to be resisting. But, what even is real-time ML, and how can it deliver a better experience?
- Working With The Lambda Layer in Keras, by Ahmed Gad [Tuto] - Jan 28, 2021.
In this tutorial we'll cover how to use the Lambda layer in Keras to build, save, and load models which perform custom operations on your data.
- How to Get a Job as a Data Scientist, by Devin Partida [Opin] - Jan 27, 2021.
Here’s a step-by-step guide to starting your career in data science.
- Popular Machine Learning Interview Questions, part 2, by Mo Daoud [Tuto] - Jan 27, 2021.
Get ready for your next job interview requiring domain knowledge in machine learning with answers to these thirteen common questions.
- Support Vector Machine for Hand Written Alphabet Recognition in R, by Mohan Rai [Tuto] - Jan 27, 2021.
We attempt to break down the problem of handwritten alphabet image recognition into a simple process rather than using heavy packages. This is an attempt to create the data and then build a model using Support Vector Machines for classification.
- KDnuggets™ News 21:n04, Jan 27: The Ultimate Scikit-Learn Machine Learning Cheatsheet; Building a Deep Learning Based Reverse Image Search - Jan 27, 2021.
The Ultimate Scikit-Learn Machine Learning Cheatsheet; Building a Deep Learning Based Reverse Image Search; Data Engineering — the Cousin of Data Science, is Troublesome; Going Beyond the Repo: GitHub for Career Growth in AI & Machine Learning; Popular Machine Learning Interview Questions
- Is M.Tech in Data Science Worth It?, by Great Learning [News] - Jan 26, 2021.
Is an M.Tech in Data Science worth it, or should you learn using just online courses and projects? Let's try to find the answer to that question.
- What to Learn to Become a Data Scientist in 2021, by Andrea Laura [Opin] - Jan 26, 2021.
As data becomes the new ‘Gold’ for businesses, data scientists are set to find their value in this gold. This write-up clearly defines the job requirements and company expectations that this phenomenally evolving role entails.
- Want to Be a Data Scientist? Don’t Start With Machine Learning, by Terence Shin [Opin] - Jan 26, 2021.
Machine learning may appear like the go-to topic to start learning for the aspiring data scientist. But thinking these techniques are the key aspects of the role is the biggest misconception. So much more goes into becoming a successful data scientist, and machine learning is only one component of broader skills around processing, managing, and understanding the science behind the data.
- Deep Learning Pioneer Geoff Hinton on his Latest Research and the Future of AI, by Craig Smith [Opin] - Jan 26, 2021.
Geoff Hinton has lived at the outer reaches of machine learning research since an aborted attempt at a carpentry career a half century ago. He spoke to Craig Smith about his work in 2020 and what he sees on the horizon for AI.
- Six Times Bigger than GPT-3: Inside Google’s TRILLION Parameter Switch Transformer Model, by Jesus Rodriguez [Tuto] - Jan 25, 2021.
Google’s Switch Transformer model could be the next breakthrough in this area of deep learning.
- The Ultimate Scikit-Learn Machine Learning Cheatsheet, by Andre Ye [Tuto] - Jan 25, 2021.
With the power and popularity of scikit-learn for machine learning in Python, this library is a foundation of any practitioner's toolset. Preview its core methods with this review of predictive modelling, clustering, dimensionality reduction, feature importance, and data transformation.
- Top Stories, Jan 18-24: How I Got 4 Data Science Offers and Doubled my Income 2 Months After Being Laid Off; Cloud Computing, Data Science and ML Trends in 2020–2022: The battle of giants [Top ] - Jan 25, 2021.
Also: Data Engineering — the Cousin of Data Science, is Troublesome; Build a Data Science Portfolio that Stands Out Using These Platforms; K-Means 8x faster, 27x lower error than Scikit-learn in 25 lines; Popular Machine Learning Interview Questions
- Null Hypothesis Significance Testing is Still Useful, by Nicole Janeway Bills [Opin] - Jan 25, 2021.
Even in the aftermath of the replication crisis, statistical significance lingers as an important concept for Data Scientists to understand.
- Building a Deep Learning Based Reverse Image Search, by Vegard Flovik [Tuto] - Jan 22, 2021.
Following the journey from unstructured data to content based image retrieval.