News / Blog
- Top KDnuggets tweets, Feb 12-18: What Does it Mean to Deploy a #MachineLearning Model? - Feb 19, 2020.
Also: A minimalist drawing that represents closeness over time. Captures the span of life with bittersweet accuracy; How much is a Data Scientist's salary in 2020?; Great set of #NLP Interview Questions #DeepLearning; 12-Hour Machine Learning Challenge: Build & deploy an app with Streamlit and DevOps tools
- 2020 INFORMS Business Analytics Conference: Where the Wild West meets the future of analytics innovation - Feb 19, 2020.
From April 26-28, more than 1,000 leading analytics professionals and industry experts will gather in Denver to explore the newest mathematical solutions to some of industry’s largest challenges.
- Hand labeling is the past. The future is #NoLabel AI - Feb 19, 2020.
Data labeling is so hot right now… but could this rapidly emerging market face disruption from a small team at Stanford and the Snorkel open source project, which enables highly efficient programmatic labeling that is 10 to 1,000x as efficient as hand labeling?
- Getting Started with R Programming - Feb 19, 2020.
An end-to-end data analysis using R, the second most requested programming language in Data Science.
- Audio Data Analysis Using Deep Learning with Python (Part 1) - Feb 19, 2020.
A brief introduction to audio data processing and genre classification using neural networks and Python.
- KDnuggets™ News 20:n07, Feb 19: 20 AI, Data Science, Machine Learning Terms for 2020; Why Did I Reject a Data Scientist Job? - Feb 19, 2020.
This week on KDnuggets: 20 AI, Data Science, Machine Learning Terms You Need to Know in 2020; Why Did I Reject a Data Scientist Job?; Fourier Transformation for a Data Scientist; Math for Programmers; Deep Neural Networks; Practical Hyperparameter Optimization; and much more!
- Hot topics at PAW Healthcare: Predicting Ebola Outbreaks, Improving Hospital Patient Flow & more - Feb 18, 2020.
Predictive Analytics World for Healthcare, May 31-Jun 4 in Las Vegas, is packed with sessions across Healthcare Business Operations and Clinical applications. Witness how data science and machine learning are employed at leading enterprises, resulting in improved outcomes, lower costs, and higher patient satisfaction. Use the code KDNUGGETS for a 15% discount on your Deep Learning World ticket.
- 20 AI, Data Science, Machine Learning Terms You Need to Know in 2020 (Part 1) - Feb 18, 2020.
2020 is well underway, and we bring you 20 AI, data science, and machine learning terms we should all be familiar with as the year marches onward.
- Google’s Data Science Interview Brain Teasers - Feb 18, 2020.
Applying for jobs at the biggest tech companies may sound intimidating because of the many stories of brain teasers and trick questions posed during interviews. Here, we share simple, intuitive explanations of some of Google's "Problem Solving" questions for data science interviews.
- Using the Fitbit Web API with Python - Feb 18, 2020.
Fitbit provides a Web API for accessing data from Fitbit activity trackers. Check out this updated tutorial on accessing Fitbit data using the API with Python.
- Seize Your New Career in Data Science - Feb 17, 2020.
Springboard’s mission has always been to enable everyone to attain their full potential by preparing students for the ever-changing world around them. You can start working towards your dream data science career and land a new role by the end of summer.
- Scaling the Wall Between Data Scientist and Data Engineer - Feb 17, 2020.
The educational and research focuses of machine learning tend to highlight the model building, training, testing, and optimization aspects of the data science process. Bringing these models into use requires a suite of engineering feats and organization, a standard for which does not yet exist. Learn more about a framework for operating a collaborative data science and engineering team to deploy machine learning models to end-users.
- Using AI to Identify Wildlife in Camera Trap Images from the Serengeti - Feb 17, 2020.
With recent developments in machine learning and computer vision, we acquired the tools to provide the biodiversity community with an ability to tap the potential of the knowledge generated automatically with systems triggered by a combination of heat and motion.
- Top Stories, Feb 10-16: Why Did I Reject a Data Scientist Job?; Fourier Transformation for a Data Scientist - Feb 17, 2020.
Also: Math for Programmers – your guide for solving math problems in code; What Does it Mean to Deploy a Machine Learning Model?; Deep Neural Networks; Easy Image Dataset Augmentation with TensorFlow; Intent Recognition with BERT using Keras and TensorFlow 2
- Inside The Machine Learning that Google Used to Build Meena: A Chatbot that Can Chat About Anything - Feb 17, 2020.
Meena is one of the major milestones in the history of NLU. How did Google build it?
- Deep Neural Networks - Feb 14, 2020.
We examine the features and applications of a deep neural network.
- What Does it Mean to Deploy a Machine Learning Model? - Feb 14, 2020.
You are a Data Scientist who knows how to develop machine learning models. You might also be a Data Scientist who is too afraid to ask how to deploy your machine learning models. The answer isn't entirely straightforward, and so is a major pain point of the community. This article will help you take a step in the right direction for production deployments that are automated, reproducible, and auditable.
- Fourier Transformation for a Data Scientist - Feb 14, 2020.
The article contains a brief mathematical introduction to the Fourier transformation and its applications in AI.
- Introduction to Geographical Time Series Prediction with Crime Data in R, SQL, and Tableau - Feb 14, 2020.
When reviewing geographical data, it can be difficult to prepare the data for an analysis. This article helps by covering importing data into a SQL Server database; cleansing and grouping data into a map grid; adding time data points to the set of grid data and filling in the gaps where no crimes occurred; importing the data into R; and running an XGBoost model to determine where crimes will occur on a specific day.
- Analytics Summit 2020: Real World Applications of Business Analytics, April 6-8, Cincinnati - Feb 13, 2020.
The 8th annual Analytics Summit 2020, sponsored by the University of Cincinnati’s Center for Business Analytics, will be held on Apr 6-8, including two analytics training days and a Conference featuring speakers presenting real world applications of data science and business analytics.
- Adversarial Validation Overview - Feb 13, 2020.
Learn how to implement adversarial validation that builds a classifier to determine if your data is from the training or testing sets. If you can do this, then your data has issues, and your adversarial validation model can help you diagnose the problem.
- Practical Hyperparameter Optimization - Feb 13, 2020.
An introduction on how to fine-tune Machine and Deep Learning models using techniques such as: Random Search, Automated Hyperparameter Tuning and Artificial Neural Networks Tuning.
- Easy Image Dataset Augmentation with TensorFlow - Feb 13, 2020.
What can we do when we don't have a substantial amount of varied training data? This is a quick intro to using data augmentation in TensorFlow to perform in-memory image transformations during model training to help overcome this data impediment.
- Top KDnuggets tweets, Feb 05-11: #SciPy 1.0: fundamental algorithms for scientific computing in #Python; Why is Data Science so popular? - Feb 12, 2020.
Why is Data Science so Popular?; Visual Paper Summary: ALBERT (A Lite BERT); Uber Has Assembled One of the Most Impressive Open Source DL Stacks; Top #AI Influencers To Follow in 2020
- Math for Programmers – your guide for solving math problems in code - Feb 12, 2020.
Math for Programmers teaches you the math you need to know for a career in programming, concentrating on what you need to know as a developer.
- Why Did I Reject a Data Scientist Job? - Feb 12, 2020.
Snagging that job as a Data Scientist might not be exactly what you were expecting. Consider this advice on carefully weighing job titles against what the position might really be like day to day.
- Sharing your machine learning models through a common API - Feb 12, 2020.
DEEPaaS API is a software component developed to expose machine learning models through a REST API. In this article we describe how to do it.
- Illustrating the Reformer - Feb 12, 2020.
In this post, we will try to dive into the Reformer model and try to understand it with some visual guides.
- KDnuggets™ News 20:n06, Feb 12: The Data Science Puzzle – 2020 Edition; The Future of ML Will Include a Lot Less Engineering - Feb 12, 2020.
Read about the Data Science Puzzle for 2020; Why the Future of ML will include a lot less Engineering; a great tutorial on density-based clustering; Optimal estimation algorithms; Intro to AI and ML based on high-school math; and more.
- Top January Stories: How to land a Data Scientist job at your dream company; I wanna be a data scientist, but … how? - Feb 11, 2020.
Also: The Book to Start You on Machine Learning; Top 5 must-have Data Science skills for 2020.
- Fidelity on How to Find a Tailor-Fit Unicorn Data Scientist - Feb 11, 2020.
Predictive Analytics World for Financial Services in Las Vegas, May 31-Jun 4 is honored to host an exceptional keynote by Fidelity Investments’ AI and Data Science Center of Excellence Leader, Victor Lo: "How to Find a Tailor-Fit 'Unicorn' Data Scientist for Financial Services". Use the code KDNUGGETS for a 15% discount on your Predictive Analytics World ticket.
- How to learn data science on your own: a practical guide - Feb 11, 2020.
While much focus today is on the rise in working from home and the challenges experienced, not as much is said about learning from home. For those lone wolves studying Data Science in a self-directed way, a range of issues can get in the way of your goal. Learn about these common problems so you can stay focused all the way to your educational goals.
- Basics of Audio File Processing in R - Feb 11, 2020.
This post provides basic information on audio processing using R as the programming language. It also walks through some basics of sound and digital audio.
- Recommender System Metrics: Comparing Apples, Oranges and Bananas - Feb 11, 2020.
This article will discuss a sometimes-overlooked aspect of what distinguishes recommender systems from other machine learning tasks: added uncertainties of measuring them.
- Observability for Data Engineering - Feb 10, 2020.
Going beyond traditional monitoring techniques and goals, understanding if a system is working as intended requires a new concept in DevOps, called Observability. Learn more about this essential approach to bring more context to your system metrics.
- Intent Recognition with BERT using Keras and TensorFlow 2 - Feb 10, 2020.
TL;DR Learn how to fine-tune the BERT model for text classification. Train and evaluate it on a small dataset for detecting seven intents. The results might surprise you!
- Top Stories, Feb 3-9: 12-Hour Machine Learning Challenge: Build & deploy an app with Streamlit and DevOps tools; The Future of Machine Learning Will Include a Lot Less Engineering - Feb 10, 2020.
Also: The Data Science Puzzle — 2020 Edition; Top 5 Data Science Trends for 2020; How to land a Data Scientist job at your dream company; Audio File Processing: ECG Audio Using Python
- Amazon Uses Self-Learning to Teach Alexa to Correct its Own Mistakes - Feb 10, 2020.
The digital assistant incorporates a reformulation engine that can learn to correct responses in real time based on customer interactions.
- AI and Machine Learning In Our Every Day Life - Feb 7, 2020.
The curiosity and buzz around the most talked-about technology -- Artificial Intelligence -- have experts and technophiles busy decoding its exciting future applications. Of course, the use of AI and machine learning is already pervasive in our daily lives, as we review many of these popular features in this article.
- Large Scale Adversarial Representation Learning - Feb 7, 2020.
GANs can be used for unsupervised learning where a generator maps latent samples to generate data, but this framework does not include an inverse mapping from data to latent representation. BiGAN adds an encoder E to the standard generator-discriminator GAN architecture — the encoder takes input data x and outputs a latent representation z of the input.
- The Data Science Puzzle — 2020 Edition - Feb 7, 2020.
The data science puzzle is once again re-examined through the relationship between several key concepts of the landscape, incorporating updates and observations since last time. Check out the results here.
- Understanding Density-based Clustering - Feb 6, 2020.
HDBSCAN is a robust clustering algorithm that is very useful for data exploration, and this comprehensive introduction provides an overview of its fundamental ideas from a high-level view above the trees to down in the weeds.
- The Future of Machine Learning Will Include a Lot Less Engineering - Feb 6, 2020.
Despite getting less attention, the systems-level design and engineering challenges in ML are still very important — creating something useful requires more than building good models, it requires building good systems.
- Getting up and Running with Python: Installing Anaconda on Windows - Feb 6, 2020.
This tutorial covers how to download and install Anaconda on Windows; how to test your installation; how to fix common installation issues; and what to do after installing Anaconda.
- Top KDnuggets tweets, Jan 29 – Feb 04: 30 Python Best Practices, Tips, And Tricks; 7 Books to Grasp Math of Data Science and ML - Feb 5, 2020.
The cost of obtaining an MSc in #DataScience in Europe; 30 Python Best Practices, Tips, And Tricks; OpenAI is Adopting PyTorch; I wanna be a data scientist, but... how?
- Intro to Machine Learning and AI based on high school knowledge - Feb 5, 2020.
Machine learning information is becoming pervasive in the media as well as a core skill in new, important job sectors. Getting started in the field can require learning complex concepts, and this article outlines an approach on how to begin learning about these exciting topics based on high school knowledge.
- Create Your Own Computer Vision Sandbox - Feb 5, 2020.
This post covers a wide array of computer vision tasks, from automated data collection to CNN model building.
- Optimal Estimation Algorithms: Kalman and Particle Filters - Feb 5, 2020.
An introduction to the Kalman and Particle Filters and their applications in fields such as Robotics and Reinforcement Learning.
- KDnuggets™ News 20:n05, Feb 5: How to land a Data Scientist job at your dream company; 12-Hour Machine Learning Challenge - Feb 5, 2020.
Read how one very persistent data scientist got a job at her dream company; Learn how to complete a 12 hour ML challenge; Top 10 AI, Machine Learning research papers; How to optimize your Jupyter notebook; and PyTorch FTW.
- Do You Trust and Understand Your Predictive Models? - Feb 4, 2020.
To help practitioners make the most of recent and disruptive breakthroughs in debugging, explainability, fairness, and interpretability techniques for machine learning, read “An Introduction to Machine Learning Interpretability, Second Edition”. Download this report now.
- Top 5 Data Science Trends for 2020 - Feb 4, 2020.
As Data Science continues to expand into the next decade, this article features five important trends in the field that are expected in 2020. Leverage these trends to help improve your business processes for maximizing growth.
- Audio File Processing: ECG Audio Using Python - Feb 4, 2020.
In this post, we will look into an application of audio file processing for a good cause: analysis of ECG heartbeats, with code written in Python.
- Serverless Machine Learning with R on Cloud Run - Feb 4, 2020.
Expedite the deployment of your machine learning models using serverless cloud infrastructure. In this tutorial, we explore creating and deploying a model which scrapes real-time Twitter data and returns an interactive visualization using R.
- Why are Machine Learning Projects so Hard to Manage? - Feb 3, 2020.
What makes deploying a machine learning project so difficult? Is it the expectations? The people? The tech? There are common threads to these challenges, and best practices exist to deal with them.
- Top Stories, Jan 27 – Feb 2: How to land a Data Scientist job at your dream company; How to Optimize Your Jupyter Notebook - Feb 3, 2020.
Also: Data Validation for Machine Learning; OpenAI is Adopting PyTorch… They Aren’t Alone; I wanna be a data scientist, but how?; Top 10 AI, Machine Learning Research Articles to know; Google Dataset Search Provides Access to 25 Million Datasets
- Microsoft Open Sources Jericho to Train Reinforcement Learning Using Linguistic Games - Feb 3, 2020.
The new framework provides an OpenAI Gym-like environment for language-based games.
- 12-Hour Machine Learning Challenge: Build & deploy an app with Streamlit and DevOps tools - Feb 3, 2020.
This article will present the knowledge, process, tools, and frameworks required for completing a 12-hour ML challenge. I hope you can find it useful for your personal or professional projects.
- Data Validation for Machine Learning - Jan 31, 2020.
While the validation process cannot directly find what is wrong, it can sometimes show us that there is a problem with the stability of the model.
- How to land a Data Scientist job at your dream company - Jan 31, 2020.
Job hunting for anyone just starting out as a data scientist can require grit, passion, and perseverance before finding the best opportunity. Follow this career search journey to learn what it took -- and the learning resources used -- to land the dream job.
- OpenAI is Adopting PyTorch… They Aren’t Alone - Jan 31, 2020.
OpenAI is moving to PyTorch for the bulk of their research work. This might be a high-profile adoption, but it is far from the only such example.
- Geek & Chic: Analytics redefining fashion instincts - Jan 31, 2020.
To effectively start with fashion retail analytics, players in the fashion retail sector need to first decide where analytics will help them achieve the greatest business impact.
- Top 10 AI, Machine Learning Research Articles to know - Jan 30, 2020.
We’ve seen many predictions for what new advances are expected in the field of AI and machine learning. Here, we review a “data set” based on what researchers were apparently studying at the turn of the decade to take a fresh glimpse into what might come to pass in 2020.
- How to Optimize Your Jupyter Notebook - Jan 30, 2020.
This article walks through some simple tricks on improving your Jupyter Notebook experience, and covers useful shortcuts, adding themes, automatically generated table of contents, and more.
- Amazon Gets Into the AutoML Race with AutoGluon: Some AutoML Architectures You Should Know About - Jan 30, 2020.
Amazon, Microsoft, Salesforce, and Waymo have produced some of the most innovative AutoML architectures in the market.
- Top KDnuggets tweets, Jan 22-28: The 5 Most Useful Techniques to Handle Imbalanced Datasets - Jan 29, 2020.
Also: Global #AI Index ranks 54 countries: US is currently ahead. China is second, but growing faster; RStudio Projects and Working Directories: A Beginner's Guide #rstats; The Book to Start You on Machine Learning; Data Scientist Archetypes
- Top 25 Session Highlights at ODSC East 2020 - Jan 29, 2020.
ODSC East is back in Boston, Apr 13-17, 2020. The preliminary schedule features a unique collection of the leading experts and rising stars of data science. Register soon, as our 50% discount ends this Friday, Jan 31!
- Managing Machine Learning Cycles: Five Learnings from comparing Data Science Experimentation/Collaboration Tools - Jan 29, 2020.
Machine learning projects require handling different versions of data, source code, hyperparameters, and environment configuration. Numerous tools are on the market for managing this variety, and this review features important lessons learned from an ongoing evaluation of the current landscape.
- Generating English Pronoun Questions Using Neural Coreference Resolution - Jan 29, 2020.
This post will introduce a practical method for generating English pronoun questions from any story or article. Learn how to take an additional step toward computationally understanding language.
- Google Dataset Search Provides Access to 25 Million Datasets - Jan 29, 2020.
Google's dataset search is out of beta, and provides centralized access to 25 million datasets.
- KDnuggets™ News 20:n04, Jan 29: AutoML: If you try it, you’ll like it more; The Data Science Interview Study Guide - Jan 29, 2020.
AutoML Poll results: if you try it, you'll like it more; The Data Science Interview Study Guide; What Do Data Scientists in Europe Do & How Much Are They Worth?; 2 Questions for a Junior Data Scientist
- A bird’s-eye view of modern AI from NeurIPS 2019 - Jan 28, 2020.
With the explosion of the field of AI/ML impacting so many applications and industries, there is great value coming out of recent progress. This review highlights many research areas covered at the NeurIPS 2019 conference recently held in Vancouver, Canada, and features many important areas of progress we expect to see in the coming year.
- Data Scientist Archetypes - Jan 28, 2020.
My goal here is to give you a map for navigating the sprawling terrain of data science. It’s to help you prioritize what you want to learn and what you want to do, so you don’t feel lost.
- Exoplanet Hunting Using Machine Learning - Jan 28, 2020.
Search for exoplanets — those planets beyond our own solar system — using machine learning, and implement these searches in Python.
- AutoML Poll results: if you try it, you’ll like it more - Jan 27, 2020.
The results of the latest KDnuggets Poll on AutoML are quite interesting. While most respondents were not happy with AutoML performance, the opinions of those who tried it were higher than those who did not.
- Agenda Sneak Peek for PAW for Industry 4.0 | Munich, 11-12 May - Jan 27, 2020.
Predictive Analytics World for Industry 4.0 is coming closer and closer. Take advantage of the Early Bird price until Feb 14! Use code KDNUGGETS for a 15% discount on your Predictive Analytics World ticket.
- The Decade of Data Science - Jan 27, 2020.
With the last decade being so strong for the emerging field of Data Science, this review considers current trends in the industry, popular frameworks, helpful tools, and new tools that can be leveraged more in the future.
- Uber Has Been Quietly Assembling One of the Most Impressive Open Source Deep Learning Stacks in the Market - Jan 27, 2020.
Many of the technologies used by Uber teams have been open sourced and received accolades from the machine learning community. Let’s look at some of my favorites.
- Top Stories, Jan 20-26: I wanna be a data scientist, but how? - Jan 27, 2020.
Also: Microsoft Introduces Project Petridish to Find the Best Neural Network for Your Problem; 10 Python String Processing Tips & Tricks; Random Forest — A Powerful Ensemble Learning Algorithm; Top 10 Technology Trends for 2020; The Book to Start You on Machine Learning
- You’re Fired: How to develop and manage a happy data science team - Jan 27, 2020.
I want to share a solution called Insight-Driven Development (IDD), a few examples of it, and five steps to adopting it. IDD aims to create high-performing, engaged, and happy Data Science teams that embrace non-ML work as much as the fun ML stuff.
- 2 Questions for a Junior Data Scientist - Jan 24, 2020.
Academic credentials and experience with previous machine learning projects are important for kicking off a data science career. However, landing your first job out of school will require you to extend your thinking about projects and problems. Learn how one interviewer homed in on desired skills by considering these two questions.
- Semi-supervised learning with Generative Adversarial Networks - Jan 24, 2020.
The paper discussed in this post, Semi-supervised learning with Generative Adversarial Networks, utilizes a GAN architecture for multi-label classification.
- Top 7 Location Intelligence Companies in 2020 - Jan 24, 2020.
Here’s a complete list of the top 7 location intelligence companies in the market, with an overview, pricing, pros, and cons that’ll help you identify the right location intelligence tool for your business.
- PAW Healthcare Sneak Peek Agenda | Munich, 11-12 May - Jan 23, 2020.
Visit Deep Learning World, 11-12 May in Munich, to broaden your knowledge, deepen your understanding and discuss your questions with other Deep Learning experts!
- How to Get Started With Algorithmic Finance - Jan 23, 2020.
Algorithmic finance has been around for decades as a money-making tool, and it's not magic. Learn about some practical strategies along with an introduction to code you can use to get started.
- What Do Data Scientists in Europe Do & How Much Are They Worth? - Jan 23, 2020.
Interested in knowing what a data scientist is worth in Europe, and what one does? Read this overview of a recent survey on the topic and gain some insight into the European data science professional job market.
- NLP Year in Review — 2019 - Jan 23, 2020.
In this blog post, I want to highlight some of the most important stories related to machine learning and NLP that I came across in 2019.
- Top KDnuggets tweets, Jan 15-21: My Pandas Cheat Sheet; 5 Key Reasons Why Data Scientists Are Quitting their Jobs - Jan 22, 2020.
5 Key Reasons Why Data Scientists Are Quitting their Jobs; My Pandas Cheat Sheet; Google Colab: Jupyter Lab on steroids (perfect for Deep Learning); Top 5 Must-have Data Science Skills.
- Big Data. Big Impact - Jan 22, 2020.
Ramapo College’s Master of Science in Data Science program will teach you to collect, synthesize, and analyze big data, become skilled in programming languages like R and Python, and leverage advanced tools to meet the demands of modern business and science.
- The Data Science Interview Study Guide - Jan 22, 2020.
Preparing for a job interview can be a full-time job, and Data Science interviews are no different. Here are 121 resources that can help you study and quiz your way to landing your dream data science job.
- The 5 Most Useful Techniques to Handle Imbalanced Datasets - Jan 22, 2020.
This post explains the various techniques you can use to handle imbalanced datasets.