2020 Jan
All (91) | Events (7) | News, Education (5) | Opinions (32) | Top Stories, Tweets (9) | Tutorials, Overviews (38)
- Decision Tree Algorithm, Explained, by Nagesh Singh Chauhan - Feb 9, 2022.
All you need to know about decision trees and how to build and optimize decision tree classifier.
- Data Validation for Machine Learning - Jan 31, 2020.
While the validation process cannot directly find what is wrong, the process can show us sometimes that there is a problem with the stability of the model.
- How to land a Data Scientist job at your dream company - Jan 31, 2020.
Job hunting for anyone just starting out as a data scientist can require grit, passion, and perseverance before finding the best opportunity. Follow this career search journey to learn what it took -- and the learning resources used -- to land the dream job.
- OpenAI is Adopting PyTorch… They Aren’t Alone - Jan 31, 2020.
OpenAI is moving to PyTorch for the bulk of their research work. This might be a high-profile adoption, but it is far from the only such example.
- Geek & Chic: Analytics redefining fashion instincts - Jan 31, 2020.
To effectively start with fashion retail analytics, players in the fashion retail sector need to first decide where analytics will help them achieve the greatest business impact.
- Top 10 AI, Machine Learning Research Articles to know - Jan 30, 2020.
We’ve seen many predictions for what new advances are expected in the field of AI and machine learning. Here, we review a “data set” based on what researchers were apparently studying at the turn of the decade to take a fresh glimpse into what might come to pass in 2020.
- How to Optimize Your Jupyter Notebook - Jan 30, 2020.
This article walks through some simple tricks on improving your Jupyter Notebook experience, and covers useful shortcuts, adding themes, automatically generated table of contents, and more.
- Amazon Gets Into the AutoML Race with AutoGluon: Some AutoML Architectures You Should Know About - Jan 30, 2020.
Amazon, Microsoft, Salesforce, Waymo have produced some of the most innovative AutoML architectures in the market.
- Top KDnuggets tweets, Jan 22-28: The 5 Most Useful Techniques to Handle Imbalanced Datasets - Jan 29, 2020.
Also: Global #AI Index ranks 54 countries: US is currently ahead. China is second, but growing faster; RStudio Projects and Working Directories: A Beginner's Guide #rstats; The Book to Start You on Machine Learning; Data Scientist Archetypes
- Top 25 Session Highlights at ODSC East 2020 - Jan 29, 2020.
ODSC East is back in Boston, Apr 13-17, 2020. The preliminary schedule features a unique collection of leading experts and rising stars of data science. Register soon, as our 50% discount ends this Friday, Jan 31!
- Managing Machine Learning Cycles: Five Learnings from comparing Data Science Experimentation/ Collaboration Tools - Jan 29, 2020.
Machine learning projects require handling different versions of data, source code, hyperparameters, and environment configuration. Numerous tools are on the market for managing this variety, and this review features important lessons learned from an ongoing evaluation of the current landscape.
- Generating English Pronoun Questions Using Neural Coreference Resolution - Jan 29, 2020.
This post will introduce a practical method for generating English pronoun questions from any story or article. Learn how to take an additional step toward computationally understanding language.
- Google Dataset Search Provides Access to 25 Million Datasets - Jan 29, 2020.
Google's dataset search is out of beta, and provides centralized access to 25 million datasets.
- A bird’s-eye view of modern AI from NeurIPS 2019 - Jan 28, 2020.
With the explosion of the field of AI/ML impacting so many applications and industries, there is great value coming out of recent progress. This review highlights many research areas covered at the NeurIPS 2019 conference recently held in Vancouver, Canada, and features many important areas of progress we expect to see in the coming year.
- Data Scientist Archetypes - Jan 28, 2020.
My goal here is to give you a map for navigating the sprawling terrain of data science. It’s to help you prioritize what you want to learn and what you want to do, so you don’t feel lost.
- Exoplanet Hunting Using Machine Learning - Jan 28, 2020.
Search for exoplanets — those planets beyond our own solar system — using machine learning, and implement these searches in Python.
- AutoML Poll results: if you try it, you’ll like it more - Jan 27, 2020.
The results of the latest KDnuggets Poll on AutoML are quite interesting. While most respondents were not happy with AutoML performance, those who had tried it rated it higher than those who had not.
- Agenda Sneak Peek for PAW for Industry 4.0 | Munich, 11-12 May - Jan 27, 2020.
Predictive Analytics World for Industry 4.0 is coming closer and closer. Take advantage of the Early Bird price until Feb 14! Use code KDNUGGETS for a 15% discount on your Predictive Analytics World ticket.
- The Decade of Data Science - Jan 27, 2020.
With the last decade being so strong for the emerging field of Data Science, this review considers current trends in the industry, popular frameworks, helpful tools, and new tools that can be leveraged more in the future.
- Uber Has Been Quietly Assembling One of the Most Impressive Open Source Deep Learning Stacks in the Market - Jan 27, 2020.
Many of the technologies used by Uber teams have been open sourced and received accolades from the machine learning community. Let’s look at some of my favorites.
- Top Stories, Jan 20-26: I wanna be a data scientist, but how? - Jan 27, 2020.
Also: Microsoft Introduces Project Petridish to Find the Best Neural Network for Your Problem; 10 Python String Processing Tips & Tricks; Random Forest — A Powerful Ensemble Learning Algorithm; Top 10 Technology Trends for 2020; The Book to Start You on Machine Learning
- You’re Fired: How to develop and manage a happy data science team - Jan 27, 2020.
I want to share a solution called Insight-Driven Development (IDD), a few examples of it, and five steps to adopting it. IDD aims to create high-performing, engaged, and happy Data Science teams that embrace non-ML work as much as the fun ML stuff.
- 2 Questions for a Junior Data Scientist - Jan 24, 2020.
Academic credentials and experience with previous machine learning projects are important for kicking off a data science career. However, landing your first job out of school will require you to extend your thinking about projects and problems. Learn how one interviewer homed in on desired skills by considering these two questions.
- Semi-supervised learning with Generative Adversarial Networks - Jan 24, 2020.
The paper discussed in this post, Semi-supervised learning with Generative Adversarial Networks, utilizes a GAN architecture for multi-label classification.
- Top 7 Location Intelligence Companies in 2020 - Jan 24, 2020.
Here’s a complete list of top 7 location intelligence companies in the market - an overview, pricing, pros, and cons that’ll help you identify the right location intelligence tool for your business.
- PAW Healthcare Sneak Peek Agenda | Munich, 11-12 May - Jan 23, 2020.
Visit Deep Learning World, 11-12 May in Munich, to broaden your knowledge, deepen your understanding and discuss your questions with other Deep Learning experts!
- How to Get Started With Algorithmic Finance - Jan 23, 2020.
Algorithmic finance has been around for decades as a money-making tool, and it's not magic. Learn about some practical strategies along with an introduction to code you can use to get started.
- What Do Data Scientists in Europe Do & How Much Are They Worth? - Jan 23, 2020.
Interested in knowing what a data scientist is worth in Europe, and what one does? Read this overview of a recent survey on the topic and gain some insight into the European data science professional job market.
- NLP Year in Review — 2019 - Jan 23, 2020.
In this blog post, I want to highlight some of the most important stories related to machine learning and NLP that I came across in 2019.
- Top KDnuggets tweets, Jan 15-21: My Pandas Cheat Sheet; 5 Key Reasons Why Data Scientists Are Quitting their Jobs - Jan 22, 2020.
5 Key Reasons Why Data Scientists Are Quitting their Jobs; My Pandas Cheat Sheet; Google Colab: Jupyter Lab on steroids (perfect for Deep Learning); Top 5 Must-have Data Science Skills.
- Big Data. Big Impact - Jan 22, 2020.
Ramapo College’s Master of Science in Data Science program will teach you to collect, synthesize, and analyze big data, become skilled in programming languages like R and Python, and leverage advanced tools to meet the demands of modern business and science.
- The Data Science Interview Study Guide - Jan 22, 2020.
Preparing for a job interview can be a full-time job, and Data Science interviews are no different. Here are 121 resources that can help you study and quiz your way to landing your dream data science job.
- The 5 Most Useful Techniques to Handle Imbalanced Datasets, by Rahul Agarwal - Jan 22, 2020.
This post is about explaining the various techniques you can use to handle imbalanced datasets.
- Random Forest® — A Powerful Ensemble Learning Algorithm - Jan 22, 2020.
The article explains the Random Forest algorithm and how to build and optimize a Random Forest classifier.
- How to reinvent your career with computer science - Jan 21, 2020.
There is great news for anyone looking to make a career switch into computer science. So, what does it take to make the leap? Check out three top tips for budding computer scientists, as well as the Computer Science online MSc from the University of Bath.
- Top 5 AI trends for 2020 - Jan 21, 2020.
We are all witnessing a staggering growth of AI technology with so many new benefits for people while also changing the way we live and work. As AI continues to grow, which applications will have a significant impact in 2020?
- Artificial Intelligence Books to Read in 2020 - Jan 21, 2020.
Here are some AI-related books that I’ve read and recommend for you to add to your 2020 reading list!
- Explaining Black Box Models: Ensemble and Deep Learning Using LIME and SHAP - Jan 21, 2020.
This article will demonstrate explainability on the decisions made by LightGBM and Keras models in classifying a transaction for fraudulence, using two state of the art open source explainability techniques, LIME and SHAP.
- Learn from Industry Leaders Facebook, Walmart, Lyft, eBay & More - Jan 20, 2020.
The agenda for Deep Learning World 2020 in Las Vegas, May 31-Jun 4, has been released. Use code KDNUGGETS for a 15% discount on your ticket.
- I wanna be a data scientist, but… how? - Jan 20, 2020.
It’s easy to say "I wanna be a data scientist," but... where do you start? How much time is needed to be desired by companies? Do you need a Master’s degree? Do you need to know every mathematical concept ever derived? The journey might be long, but follow this plan to help you keep moving forward toward your career goal.
- Top Stories, Jan 13-19: Math for Programmers!; Decision Tree Algorithm, Explained - Jan 20, 2020.
Also: Top 9 Mobile Apps for Learning and Practicing Data Science; Classify A Rare Event Using 5 Machine Learning Algorithms; The Future of Machine Learning; The Book to Start You on Machine Learning
- We Created a Lazy AI - Jan 20, 2020.
This article is an overview of how to design and implement reinforcement learning for the real world.
- Microsoft Introduces Project Petridish to Find the Best Neural Network for Your Problem - Jan 20, 2020.
The new algorithm takes a novel approach to neural architecture search.
- 10 Python String Processing Tips & Tricks - Jan 20, 2020.
Pursuing a text analytics path but don't know where to start? Try this string processing primer to first gain an understanding of using Python to manipulate and process strings at a basic level.
- The Future of Machine Learning - Jan 17, 2020.
This summary overviews the keynote at TensorFlow World by Jeff Dean, Head of AI at Google, that considered the advancements of computer vision and language models and predicted the direction machine learning model building should follow for the future.
- Top 9 Mobile Apps for Learning and Practicing Data Science - Jan 17, 2020.
This article covers the top 9 mobile apps that help users learn and practice data science, and in turn improve their productivity.
- Methods, challenges & applications of Deep Learning | Munich 11-12 May - Jan 16, 2020.
Visit Deep Learning World, 11-12 May in Munich, to broaden your knowledge, deepen your understanding and discuss your questions with other Deep Learning experts!
- Top 10 Technology Trends for 2020 - Jan 16, 2020.
With integrations of multiple emerging technologies just in the past year, AI development continues at a fast pace. Following the blueprint of science and technology advancements in 2019, we predict 10 trends we expect to see in 2020 and beyond.
- Schema Evolution in Data Lakes - Jan 16, 2020.
Whereas a data warehouse will need rigid data modeling and definitions, a data lake can store different types and shapes of data. In a data lake, the schema of the data can be inferred when it’s read, providing the aforementioned flexibility. However, this flexibility is a double-edged sword.
- Handling Trees in Data Science Algorithmic Interview - Jan 16, 2020.
This post fast-tracks the study and explanation of tree concepts for data scientists, so that you can breeze through the next time you are asked about them in an interview.
- Top KDnuggets tweets, Jan 08-14: A Beginners Guide to Data Engineering — Part I - Jan 15, 2020.
Also: The Book to Start You on Machine Learning - KDnuggets; Top KDnuggets tweets, Jan 1-7: Introduction to #DataVisualization and Storytelling: A Guide For The #DataScientist #eBook; 7 Steps to a Job-winning Data Science Resume - KDnuggets; Tips for open-sourcing research code
- Math for Programmers! - Jan 15, 2020.
Math for Programmers teaches you the math you need to know for a career in programming, concentrating on what you need to know as a developer.
- Disentangling disentanglement: Ideas from NeurIPS 2019 - Jan 15, 2020.
The NeurIPS 2019 conference in Vancouver recently concluded and featured a dozen papers on disentanglement in deep learning. What is this idea and why is it so interesting in machine learning? This summary of these papers will give you initial insight into disentanglement as well as ideas on what you can explore next.
- Classify A Rare Event Using 5 Machine Learning Algorithms - Jan 15, 2020.
Which algorithm works best for unbalanced data? Are there any tradeoffs?
- Geovisualization with Open Data - Jan 15, 2020.
In this post I want to show how to use publicly available (open) data to create geo visualizations in Python. Maps are a great way to communicate and compare information when working with geolocation data. There are many frameworks to plot maps; here I focus on matplotlib and geopandas (and give a glimpse of mplleaflet).
- Survey Segmentation Tutorial - Jan 14, 2020.
Learn the basics of verifying segmentation, analyzing the data, and creating segments in this tutorial. When reviewing survey data, you will typically be handed Likert questions (e.g., on a scale of 1 to 5), and by using a few techniques, you can verify the quality of the survey and start grouping respondents into populations.
- 7 AI Use Cases Transforming Live Sports Production and Distribution - Jan 14, 2020.
Here are 7 powerful AI-led use cases, both for linear television and for OTT apps, that are transforming the live sports production landscape.
- Statistical Thinking for Industrial Problem Solving: a free online course. - Jan 13, 2020.
This online course is available – for free – to anyone interested in building practical skills in using data to solve problems better.
- Idiot’s Guide to Precision, Recall, and Confusion Matrix - Jan 13, 2020.
Building Machine Learning models is fun, but making sure we build the best ones is what makes a difference. Follow this quick guide to appreciate how to effectively evaluate a classification model, especially for projects where accuracy alone is not enough.
- Top Stories, Jan 6-12: Top 5 must-have Data Science skills for 2020; 7 Resources to Becoming a Data Engineer - Jan 13, 2020.
Also: The Book to Start You on Machine Learning; An Introductory Guide to NLP for Data Scientists with 7 Common Techniques; A Comprehensive Guide to Natural Language Generation; 10 Python Tips and Tricks You Should Learn Today
- Graph Machine Learning Meets UX: An uncharted love affair - Jan 13, 2020.
When machine learning tools are developed by technology first, they risk failing to deliver on what users actually need. It can also be difficult for development teams to establish meaningful direction. This article explores the challenges of designing an interface that enables users to visualise and interact with insights from graph machine learning, and explores the very new, uncharted relationship between machine learning and UX.
- Uber Creates Generative Teaching Networks to Better Train Deep Neural Networks - Jan 13, 2020.
The new technique can really improve how deep learning models are trained at scale.
- Top December Stories: What is a Data Scientist Worth? AI, ML, DS, DL Research Main Developments and Key Trends - Jan 10, 2020.
Also: Google's New Explainable AI Service; 10 Free Top Notch Machine Learning Courses.
- Applying Occam’s razor to Deep Learning - Jan 10, 2020.
Finding a deep learning model that performs well is an exciting feat. But might there be other -- less complex -- models that perform just as well for your application? A simple complexity measure based on the statistical physics concept of Cascading Periodic Spectral Ergodicity (cPSE) can help us be computationally efficient by favoring the least complex model during model selection.
- Deepfakes Security Risks - Jan 10, 2020.
Deepfakes have instilled panic in experts since they first emerged in 2017. Microsoft and Facebook have recently announced a contest to identify deepfakes more efficiently.
- 7 Steps to a Job-winning Data Science Resume - Jan 10, 2020.
A resume plays a key role in bagging that dream data science job. We break down the nuances of a job-winning data science resume so that you can go ahead and transform your own resume.
- Fast Track Your Data Science Career - Jan 9, 2020.
Earn a Master of Professional Studies in Data Analytics online through Penn State World Campus – and you can add in-demand skills to your wheelhouse while you continue to work.
- An Introductory Guide to NLP for Data Scientists with 7 Common Techniques - Jan 9, 2020.
Data Scientists work with tons of data, and many times that data includes natural language text. This guide reviews 7 common techniques with code examples to introduce you to the essentials of NLP, so you can begin performing analysis and building models from textual data.
- The Book to Start You on Machine Learning - Jan 9, 2020.
This book is intended for beginners in Machine Learning who are looking for a practical approach to learning by building projects and studying the different Machine Learning algorithms within a specific context.
- Stock Market Forecasting Using Time Series Analysis, by Nagesh Singh Chauhan - Jan 9, 2020.
Time series analysis will be the best tool for forecasting the trend or even the future. The trend chart will provide adequate guidance for the investor. So let us understand this concept in great detail and use a machine learning technique to forecast stocks.
- Top KDnuggets tweets, Jan 01-07: Introduction to Data Visualization and Storytelling: A Guide For The Data Scientist eBook - Jan 8, 2020.
Introduction to Data Visualization & Storytelling; The Data Science Interview Study Guide; Why Kaggle will NOT make you a great Data Scientist; Cartoon: Teaching Ethics to AI
- Top 5 must-have Data Science skills for 2020 - Jan 8, 2020.
The standard job description for a Data Scientist has long highlighted skills in R, Python, SQL, and Machine Learning. With the field evolving, these core competencies are no longer enough to stay competitive in the job market.
- Learning SQL the Hard Way - Jan 8, 2020.
Simply put: This post is about installing SQL, explaining SQL and running SQL.
- 10 Python Tips and Tricks You Should Learn Today - Jan 8, 2020.
Check out this collection of 10 Python snippets that can be taken as a reference for your daily work.
- 5 Hands-on Skills Every Data Scientist Needs in 2020 – Coming to ODSC East - Jan 7, 2020.
Here are our top five hands-on training focus areas that every data scientist should know and that we’re paying extra attention to at ODSC East 2020 this April 13-17 in Boston.
- 7 Resources to Becoming a Data Engineer - Jan 7, 2020.
An estimated 8,650% growth in the volume of data to 175 zettabytes from 2010 to 2025 has created an enormous need for Data Engineers to build an organization's big data platform to be fast, efficient and scalable.
- A Comprehensive Guide to Natural Language Generation - Jan 7, 2020.
Follow this overview of Natural Language Generation covering its applications in theory and practice. The evolution of NLG architecture is also described from simple gap-filling to dynamic document creation along with a summary of the most popular NLG models.
- Introducing Generalized Integrated Gradients (GIG): A Practical Method for Explaining Diverse Ensemble Machine Learning Models - Jan 7, 2020.
There is a need for a new way to explain complex, ensembled ML models for high-stakes applications such as credit and lending. This is why we invented GIG.
- 5 Ways AI Is Changing The Healthcare Industry - Jan 7, 2020.
The healthcare AI market is expected to reach 28 billion dollars by the year 2025. With such rapid growth, AI is likely to bring some drastic changes to the healthcare industry. Let’s look at five ways AI has changed the healthcare industry.
- Live Webinar: Learn how to build better machine learning pipelines - Jan 6, 2020.
In this webinar, Jan 15 @ 12PM EST, we'll offer solutions to the common challenges data scientists and data engineers face when building a machine learning pipeline. Register now to attend live or to watch a recording afterwards.
- 3 common data science career transitions, and how to make them happen - Jan 6, 2020.
Breaking into a career in Data Science can depend on where you start. See if you fit into one of these three categories of "newbies," and then find out how to make your professional transition into the field.
- H2O Framework for Machine Learning - Jan 6, 2020.
This article is an overview of H2O, a scalable and fast open-source platform for machine learning. We will apply it to perform classification tasks.
- Top Stories, Dec 30 – Jan 5: How To Ultralearn Data Science; Automated Machine Learning: How do teams work together on an AutoML project? - Jan 6, 2020.
Also: Predict Electricity Consumption Using Time Series Analysis; What is the most important question for Data Science (and Digital Transformation); Why Python is One of the Most Preferred Languages for Data Science?; What is a Data Scientist Worth?; How to Speed up Pandas by 4x with one line of code
- How to Convert a Picture to Numbers - Jan 6, 2020.
Reducing images to numbers makes them amenable to computation. Let's take a look at the why and the how using Python and Numpy.
- Cartoon: Teaching Ethics to AI - Jan 4, 2020.
Ethics in AI has received significant attention recently, and the new KDnuggets cartoon examines the problem of teaching ethics to artificially intelligent entities.
- Why Python is One of the Most Preferred Languages for Data Science? - Jan 3, 2020.
Why do most data scientists love Python? Learn more about how so many well-developed Python packages can help you accomplish your crucial data science tasks.
- Beginner’s Guide to K-Nearest Neighbors in R: from Zero to Hero - Jan 3, 2020.
This post presents a pipeline for building a KNN model in R with various measurement metrics.
- How HR Is Using Data Science and Analytics to Close the Gender Gap - Jan 3, 2020.
The gender gap can extend to the lack of equal representation in certain industries or career paths, and there's an extraordinarily long way to go before people will be on equal footing in the labor market. Human resources professionals can rely on data analytics to make progress.
- Accuracy vs Speed – what Data Scientists can learn from Search - Jan 2, 2020.
Delivering accurate insights is the core function of any data scientist. Navigating the development road toward this goal can sometimes be tricky, especially when cross-collaboration is required, and these lessons learned from building a search application will help you negotiate the demands between accuracy and speed.
- Automated Machine Learning: How do teams work together on an AutoML project? - Jan 2, 2020.
In this use case, available to the public on GitHub, we’ll see how a data scientist, project manager, and business lead at a retail grocer can leverage automated machine learning and Azure Machine Learning service to reduce product overstock.
- Predict Electricity Consumption Using Time Series Analysis - Jan 2, 2020.
Time series forecasting is a technique for predicting events through a sequence of time. In this post, we will take a small forecasting problem and work through it from start to finish, learning time series forecasting along the way.