2019 Sep
- Will Machine Learning End Retail? Data Science Seattle Oct 17, 2019
- Sep 30, 2019.
In advance of the Data Science Salon taking place in Seattle on Oct 17, we asked our speakers to shed some light on how Artificial Intelligence and Machine Learning are impacting one of America’s most disruptive industries. Read on for more insight, and then register with the exclusive KDnuggets link for 20% off tickets.
- How AI will transform healthcare (and can it fix the US healthcare system?)
- Sep 30, 2019.
This thorough review focuses on the impact of AI, 5G, and edge computing on the healthcare sector in the 2020s, and also looks at quantum computing's potential impact on AI, healthcare, and financial services.
- Top Stories, Sep 23-29: The Future of Analytics and Data Science; 5 Famous Deep Learning Courses/Schools of 2019
- Sep 30, 2019.
Also: 12 Deep Learning Researchers and Leaders; Natural Language in Python using spaCy: An Introduction; A Single Function to Streamline Image Classification with Keras; Which Data Science Skills are core and which are hot/emerging ones?; 6 bits of advice for Data Scientists
- Know Your Data: Part 1
- Sep 30, 2019.
This article introduces the different types of data sets, data objects, and attributes.
- DeepMind Has Quietly Open Sourced Three New Impressive Reinforcement Learning Frameworks
- Sep 30, 2019.
Three new releases that will help researchers streamline the implementation of reinforcement learning programs.
- Webinar: Build auto-adaptive machine learning models with Kubernetes
- Sep 27, 2019.
This live webinar, on Oct 2, 2019, will show data scientists and machine learning engineers how to build, manage, and deploy auto-adaptive machine learning models in production. Save your spot now.
- Using Time Series Encodings to Discover Baseball History’s Most Interesting Seasons
- Sep 27, 2019.
Take me out to the ballgame! Take me out to the crowd! For the 2,829 seasons that have been played for 101 baseball teams since 1880, which seasons were unlike any others? Using SAX Encoding to recognize patterns in time series data, the most special years in baseball can be found.
- What is Hierarchical Clustering?
- Sep 27, 2019.
This article contains a brief introduction to various concepts related to the hierarchical clustering algorithm.
- Data Mapping Using Machine Learning
- Sep 27, 2019.
Data mapping is a way to organize various bits of data into a manageable and easy-to-understand system.
- Why data analysts should choose stories over statistics
- Sep 26, 2019.
Join the Crunch Data Conference in Budapest, Oct 16-18, with stellar speakers from companies like Facebook, Netflix and LinkedIn. Use the discount code ‘KDNuggets’ to save $100 off your conference ticket.
- The Future of Analytics and Data Science
- Sep 26, 2019.
Learn about the current and future issues of data science and possible solutions from this interview with IADSS Co-founder, Dr. Usama Fayyad following his keynote speech at ODSC Boston 2019.
- Natural Language in Python using spaCy: An Introduction
- Sep 26, 2019.
This article provides a brief introduction to working with natural language (sometimes called “text analytics”) in Python using spaCy and related libraries.
- Customer Segmentation for R Users
- Sep 26, 2019.
This article shows you how to separate your customers into distinct groups based on their purchase behavior. For the R enthusiasts out there, I demonstrated what you can do with r/stats, ggradar, ggplot2, animation, and factoextra.
- Top KDnuggets tweets, Sep 18-24: Python Libraries for Interpretable Machine Learning; Scikit-Learn: A silver bullet for basic ML
- Sep 25, 2019.
Python Libraries for Interpretable Machine Learning; Scikit-Learn: A silver bullet for basic machine learning; I wasn't getting hired as a Data Scientist. So I sought data on who is; Which Data Science Skills are core and which are hot/emerging ones?
- Help Your Career Survive ‘DataGeddon’
- Sep 25, 2019.
Penn State’s fully online data analytics program uniquely prepares students to advance their career in data science. Penn State offers 3 intakes every year and reviews applications on a rolling basis. GMAT or GRE waivers are available to highly qualified candidates. Learn more now.
- 6 bits of advice for Data Scientists
- Sep 25, 2019.
As a data scientist, you can get lost in your daily dives into the data. Consider following this advice in your work to be more diligent and more impactful for your organization.
- The thin line between data science and data engineering
- Sep 25, 2019.
Today, as companies have finally come to understand the value that data science can bring, more and more emphasis is being placed on the implementation of data science in production systems. And as these implementations have required models that can perform on larger and larger datasets in real-time, an awful lot of data science problems have become engineering problems.
- Beta Distribution: What, When & How
- Sep 25, 2019.
This article covers the beta distribution, and explains it using baseball batting averages.
- AI World Conference & Expo, Oct 23-25, Boston – Updated Agenda and Special KDnuggets Discount
- Sep 24, 2019.
AI World Conference & Expo has become the industry’s largest independent business event focused on the state of the practice of AI in the enterprise. Join us in Boston, Oct 23-25. Use the discount code 1968-KDN and SAVE $200.
- Automatic Version Control for Data Scientists
- Sep 24, 2019.
How can you keep your machine learning models and data organized so you can collaborate effectively? Discover this new tool set available for better version control designed for the data scientist workflow.
- Data Quality Assessment Is Not All Roses. What Challenges Should You Be Aware Of?
- Sep 24, 2019.
Of all data quality characteristics, we consider consistency and accuracy to be the most difficult ones to measure. Here, we describe the challenges that you may encounter and the ways to overcome them.
- A 2019 Guide for Automatic Speech Recognition
- Sep 24, 2019.
In this article, we’ll look at a couple of papers aimed at solving the problem of automated speech recognition with machine and deep learning.
- 5 Famous Deep Learning Courses/Schools of 2019
- Sep 24, 2019.
Deep Learning has become the hottest skill in Data Science at the moment. There is a plethora of articles, courses, technologies, influencers and resources that we can leverage to gain Deep Learning skills.
- Getting to the Future First: How Social Data is Transforming Trend Discovery
- Sep 23, 2019.
Register now for this webinar, Sep 25 @ 12 PM ET, for a clear approach on how to apply machine learning language technology to massive, unstructured data sets in order to create predictive models of what may be the next “it” ingredient, color, flavor or pack size.
- 12 Deep Learning Researchers and Leaders
- Sep 23, 2019.
Our list of deep learning researchers and industry leaders covers the people you should follow to stay current with this wildly expanding field in AI. From early practitioners and established academics to entrepreneurs and today’s top corporate influencers, this diverse group of individuals is leading the way into tomorrow’s deep learning landscape.
- A Single Function to Streamline Image Classification with Keras
- Sep 23, 2019.
We show, step-by-step, how to construct a single, generalized, utility function to pull images automatically from a directory and train a convolutional neural net model.
- Top Stories, Sep 16-22: Which Data Science Skills are core and which are hot/emerging ones?
- Sep 23, 2019.
Also: Explore the world of Bioinformatics with Machine Learning; My journey path from a Software Engineer to BI Specialist to a Data Scientist; 5 Beginner Friendly Steps to Learn Machine Learning and Data Science with Python; 10 Great Python Resources for Aspiring Data Scientists
- Introducing IceCAPS: Microsoft’s Framework for Advanced Conversation Modeling
- Sep 23, 2019.
The new open source framework that brings multi-task learning to conversational agents.
- Beyond Explainability: A Practical Guide to Managing Risks in Machine Learning Models
- Sep 20, 2019.
This white paper provides the first-ever standard for managing risk in AI and ML, focusing on both practical processes and technical best practices “beyond explainability” alone. Download now.
- The Hidden Risk of AI and Big Data
- Sep 20, 2019.
With recent advances in AI being enabled through access to so much “Big Data” and cheap computing power, there is incredible momentum in the field. Can big data really deliver on all this hype, and what can go wrong?
- A Gentle Introduction to PyTorch 1.2
- Sep 20, 2019.
This comprehensive tutorial aims to introduce the fundamentals of PyTorch building blocks for training neural networks.
- Automate Hyperparameter Tuning for Your Models
- Sep 20, 2019.
When we create our machine learning models, a common task that falls on us is how to tune them. So that brings us to the quintessential question: Can we automate this process?
- Webinar: Data-Driven Approaches to Forecasting
- Sep 19, 2019.
Whether it’s demand forecasting, supply chain management, or any other application, getting it right requires balancing the need for performance with the constraints of implementation and complexity. Learn more in this free webinar, Data-Driven Approaches to Forecasting, Sep 26.
- Scikit-Learn & More for Synthetic Dataset Generation for Machine Learning
- Sep 19, 2019.
While mature algorithms and extensive open-source libraries are widely available for machine learning practitioners, sufficient data to apply these techniques remains a core challenge. Discover how to leverage scikit-learn and other tools to generate synthetic data appropriate for optimizing and fine-tuning your models.
- Applying Data Science to Cybersecurity Network Attacks & Events
- Sep 19, 2019.
Check out this detailed tutorial on applying data science to the cybersecurity domain, written by an individual with backgrounds in both fields.
- 5 Beginner Friendly Steps to Learn Machine Learning and Data Science with Python
- Sep 19, 2019.
“I want to learn machine learning and artificial intelligence, where do I start?” Here.
- Top KDnuggets tweets, Sep 11-17: Python Libraries for Interpretable Machine Learning
- Sep 18, 2019.
Also: Cartoon: Unsupervised #MachineLearning?; Cartoon: Unsupervised Machine Learning?; How to Become More Marketable as a Data Scientist; Ensemble Methods for Machine Learning: AdaBoost
- Python 2 End of Life Survey – Are You Prepared?
- Sep 18, 2019.
Support for Python 2 will expire on Jan. 1, 2020, after which the Python core language and many third-party packages will no longer be supported or maintained. Take this survey to help determine and share your level of preparation.
- The 5 Sampling Algorithms every Data Scientist need to know
- Sep 18, 2019.
Algorithms are at the core of data science, and sampling is a critical technique that can make or break a project. Learn more about the most common sampling techniques used, so you can select the best approach while working with your data.
- Reddit Post Classification
- Sep 18, 2019.
This article covers the implementation of a data scraping and natural language processing project which had two parts: scrape as many posts from Reddit’s API as allowed, and then use classification models to predict the origin of the posts.
- Data Science is Boring (Part 1)
- Sep 18, 2019.
Read about how one data scientist copes with his boring days of deploying machine learning.
- Which Data Science Skills are core and which are hot/emerging ones?
- Sep 17, 2019.
We identify two main groups of Data Science skills: A: 13 core, stable skills that most respondents have and B: a group of hot, emerging skills that most do not have (yet) but want to add. See our detailed analysis.
- Turbo-Charging Data Science with AutoML
- Sep 17, 2019.
Join this technical webinar on Oct 3, where Domino Chief Data Scientist Josh Poduska will dive into popular open source and proprietary AutoML tools, and walk through hands-on examples of how to install and use these tools, so you can start using these technologies in your work right away.
- 5 Alternative Data Science Tools
- Sep 17, 2019.
What other creative tools for data science beyond Python and R can you use to make an impression? It's not about the tool -- it's about its impact.
- Explore the world of Bioinformatics with Machine Learning
- Sep 17, 2019.
This article contains a brief introduction to Bioinformatics and shows how a machine learning classification algorithm can be used to classify the type of cancer in each patient by their gene expressions.
- BERT, RoBERTa, DistilBERT, XLNet: Which one to use?
- Sep 17, 2019.
Lately, varying improvements over BERT have been shown — and here I will contrast the main similarities and differences so you can choose which one to use in your research or application.
- How Bad is Multicollinearity?
- Sep 17, 2019.
For some people, anything below 60% is acceptable, and for certain others, even a correlation of 30% to 40% is considered too high because one variable may just end up exaggerating the performance of the model or completely messing up parameter estimates.
- Data Science Symposium 2019, Oct 10-11, Cincinnati
- Sep 16, 2019.
The UC Center for Business Analytics will present the Data Science Symposium 2019 on Oct 10 & 11, featuring 3 keynote speakers and 16 tech talks/tutorials on a wide range of data science topics and tools.
- My journey path from a Software Engineer to BI Specialist to a Data Scientist
- Sep 16, 2019.
The career path of the Data Scientist remains a hot target for many with its continuing high demand. Becoming one requires developing a broad set of skills including statistics, programming, and even business acumen. Learn more about one person's experience making this journey, and discover the many resources available to help you find your way into a world of data science.
- Top Stories, Sep 9-15: 10 Great Python Resources for Aspiring Data Scientists
- Sep 16, 2019.
Also: The 5 Graph Algorithms That Data Scientists Should Know; Many Heads Are Better Than One: The Case For Ensemble Learning; BERT is changing the NLP landscape; I wasn't getting hired as a Data Scientist; There is No Free Lunch in Data Science
- 5 Step Guide to Scalable Deep Learning Pipelines with d6tflow
- Sep 16, 2019.
How to turn a typical PyTorch script into a scalable d6tflow DAG for faster research & development.
- What is Machine Behavior?
- Sep 16, 2019.
The new emerging field that wants to study AI agents the way social scientists study humans.
- Cartoon: Unsupervised Machine Learning?
- Sep 14, 2019.
New KDnuggets Cartoon looks at one of the hottest directions in Machine Learning and asks "Can Machine Learning be too unsupervised?"
- Many Heads Are Better Than One: The Case For Ensemble Learning
- Sep 13, 2019.
While ensembling techniques are notoriously hard to set up, operate, and explain, with the latest modeling, explainability and monitoring tools, they can produce more accurate and stable predictions. And better predictions can be better for business.
- Version Control for Data Science: Tracking Machine Learning Models and Datasets
- Sep 13, 2019.
I am a Git god, why do I need another version control system for Machine Learning Projects?
- The State of Transfer Learning in NLP
- Sep 13, 2019.
This post expands on the NAACL 2019 tutorial on Transfer Learning in NLP organized by Matthew Peters, Swabha Swayamdipta, Thomas Wolf, and Sebastian Ruder. This post highlights key insights and takeaways and provides updates based on recent work.
- Clearsense chooses Io-Tahoe’s Smart Data Discovery to navigate healthcare data challenges
- Sep 12, 2019.
Io-Tahoe, a pioneer in Smart Data Discovery and AI-Driven Data Catalog products, has announced that Clearsense, a scalable data platform as a service built for healthcare, has chosen the smart data discovery platform to automatically discover and catalog relationships across immense amounts of medical and clinical data.
- There is No Free Lunch in Data Science
- Sep 12, 2019.
There is no such thing as a free lunch in life or data science. Here, we'll explore some science philosophy and discuss the No Free Lunch theorems to find out what they mean for the field of data science.
- Ensemble Methods for Machine Learning: AdaBoost
- Sep 12, 2019.
It turns out that if we ask the weak algorithm to create a whole bunch of classifiers (all weak by definition) and then combine them all, what may emerge is a stronger classifier.
- A Friendly Introduction to Support Vector Machines
- Sep 12, 2019.
This article explains the Support Vector Machines (SVM) algorithm in an easy way.
- Classification vs Prediction
- Sep 12, 2019.
It is important to distinguish prediction and classification. In many decision-making contexts, classification represents a premature decision, because classification combines prediction and decision making and usurps the decision maker in specifying costs of wrong decisions.
- Top KDnuggets tweets, Sep 04-10: How #AI will transform #healthcare; 10 Great Python Resources for Aspiring Data Scientists
- Sep 11, 2019.
Python Libraries for Interpretable Machine Learning; How #AI will transform #healthcare (and can it fix US healthcare system?); Building Recommendation System - an overview; I wasn't getting hired as a Data Scientist. So I sought data on who is.
- Data Driven Government – Agenda, Washington, DC, Sep 25
- Sep 11, 2019.
Data Driven Government is coming to Washington, DC, Sep 26, and includes a stellar lineup of experts who will share the emerging trends and best practices of government agencies in the current use of data analytics to enhance mission outcomes. Use code KDNUGGETS to get 15% off.
- Can graph machine learning identify hate speech in online social networks?
- Sep 11, 2019.
Online hate speech is a complex subject. Follow this demonstration using state-of-the-art graph neural network models to detect hateful users based on their activities on the Twitter social network.
- Train sklearn 100x Faster
- Sep 11, 2019.
As compute gets cheaper and time to market for machine learning solutions becomes more critical, we’ve explored options for speeding up model training. One of those solutions is to combine elements from Spark and scikit-learn into our own hybrid solution.
- Top August Stories: How to Become More Marketable as a Data Scientist
- Sep 10, 2019.
Also: Top Handy SQL Features for Data Scientists; 12 NLP Researchers, Practitioners & Innovators You Should Be Following; Knowing Your Neighbours: Machine Learning on Graphs
- Discover Your Path Toward Data Science with ODSC’s Mini-Bootcamp
- Sep 10, 2019.
ODSC has developed a mini-bootcamp, designed to reduce the time and monetary costs of discovering which pathway into data science you should take. In this article, we’ll discuss seven reasons why ODSC’s Mini-Bootcamp might be right for you.
- Scikit-Learn vs mlr for Machine Learning
- Sep 10, 2019.
How does the scikit-learn machine learning library for Python compare to the mlr package for R? Follow along with a machine learning workflow through each approach, and see if you can gain a competitive advantage by knowing both frameworks.
- The 5 Graph Algorithms That Data Scientists Should Know
- Sep 10, 2019.
In this post, I am going to be talking about some of the most important graph algorithms you should know and how to implement them using Python.
- Common Machine Learning Obstacles
- Sep 9, 2019.
In this blog, Seth DeLand of MathWorks discusses two of the most common obstacles, which relate to choosing the right classification model and eliminating data overfitting.
- Top Stories, Sep 2-8: I wasn’t getting hired as a Data Scientist. So I sought data on who is.
- Sep 9, 2019.
Also: Python Libraries for Interpretable Machine Learning; TensorFlow vs PyTorch vs Keras for NLP; Advice on building a machine learning career and reading research papers by Prof. Andrew Ng; Object-oriented programming for data scientists: Build your ML estimator
- BERT is changing the NLP landscape
- Sep 9, 2019.
BERT is changing the NLP landscape and making chatbots much smarter by enabling computers to better understand speech and respond intelligently in real-time.
- A 2019 Guide to Speech Synthesis with Deep Learning
- Sep 9, 2019.
In this article, we’ll look at research and model architectures that have been written and developed to do just that using deep learning.
- OpenStreetMap Data to ML Training Labels for Object Detection
- Sep 9, 2019.
I am really interested in creating a tight, clean pipeline for disaster relief applications, where we can use something like crowd sourced building polygons from OSM to train a supervised object detector to discover buildings in an unmapped location.
- How DeepMind and Waymo are Using Evolutionary Competition to Train Self-Driving Vehicles
- Sep 9, 2019.
Recently, Alphabet’s subsidiaries Waymo and DeepMind partnered to find a more efficient process to train self-driving vehicle algorithms, and their work took them back to one of the cornerstones of our history as a species: evolution.
- 10 Great Python Resources for Aspiring Data Scientists
- Sep 9, 2019.
This is a collection of 10 interesting resources in the form of articles and tutorials for the aspiring data scientist new to Python, meant to provide both insight and practical instruction when starting on your journey.
- Designing Dashboards that Users Actually Like – Free Webcast
- Sep 6, 2019.
See how creating a system of purpose-specific displays enables users to quickly get answers to their data-related questions.
- What’s the difference between analytics and statistics?
- Sep 6, 2019.
From asking the best questions about data to answering those questions with certainty, the value of these two seemingly different professions becomes clear when you see how they should work together.
- I wasn’t getting hired as a Data Scientist. So I sought data on who is.
- Sep 6, 2019.
Instead of focusing on skills thought to be required of data scientists, we can look at what they have actually done before.
- Build Your First Voice Assistant
- Sep 6, 2019.
Hone your practical speech recognition application skills with this overview of building a voice assistant using Python.
- TensorFlow Optimization Showdown: ActiveState vs. Anaconda
- Sep 5, 2019.
In this TensorFlow tutorial, you’ll learn the impact of optimizing both operators and entire graphs, how to efficiently organize data in training and testing datasets to minimize data shuffling, and how to identify a well-optimized model using Anaconda and ActivePython.
- 3 Ways to Manage Human Bias in the Analytics Process
- Sep 5, 2019.
Managing human bias is an important part of the analytics process. Learn about three areas to watch out for to ensure your models are as unbiased as possible.
- Automated Machine Learning: Just How Much?
- Sep 5, 2019.
In this interview, Rosaria Silipo asks data scientists Paolo Tamagnini, Simon Schmid and Christian Dietz a few questions about automated machine learning from their point of view, along with some interesting examples of its practical use.
- Advice on building a machine learning career and reading research papers by Prof. Andrew Ng
- Sep 5, 2019.
This blog summarizes the career advice/reading research papers lecture in the CS230 Deep learning course by Stanford University on YouTube, and includes advice from Andrew Ng on how to read research papers.
- Top KDnuggets tweets, Aug 28 – Sep 03: The 8 Neural Network Architectures #MachineLearning Researchers Need to Learn
- Sep 4, 2019.
Also: The secret sauce for growing from a data analyst to a data scientist; 4 Tips for Advanced Feature Engineering and Preprocessing; R Users’ Salaries from the 2019 Stackoverflow Survey; Emoji Analytics
- Learn Quantum Computing with Python and Q#, Get Programming with Python, Data Science with Python and Dask
- Sep 4, 2019.
Save 40% on Get Programming with Python, Data Science with Python and Dask, and Learn Quantum Computing with Python and Q# with code nlpython40.
- An Easy Introduction to Machine Learning Recommender Systems
- Sep 4, 2019.
Recommender systems are an important class of machine learning algorithms that offer "relevant" suggestions to users. They are categorized as either collaborative filtering or content-based systems; check out how these approaches work, along with example code implementations to follow.
- Python Libraries for Interpretable Machine Learning
- Sep 4, 2019.
In the following post, I am going to give a brief guide to four of the most established packages for interpreting and explaining machine learning models.
- An Overview of Topics Extraction in Python with Latent Dirichlet Allocation
- Sep 4, 2019.
A recurring subject in NLP is understanding a large corpus of texts through topic extraction. Whether you analyze users’ online reviews, products’ descriptions, or text entered in search bars, understanding key topics will always come in handy.
- Starting out in Data Science? Top tips and advice from DataScienceGO Speakers
- Sep 3, 2019.
DataScienceGO returns to San Diego Sep 27-29 for a three-day career-focused conference designed to unite newcomers, practitioners, managers and executives under one umbrella. Speakers weigh in on how to forge the best teams, increase your hiring chances, and prepare for the future.
- TensorFlow vs PyTorch vs Keras for NLP
- Sep 3, 2019.
These three deep learning frameworks are your go-to tools for NLP, so which is the best? Check out this comparative analysis based on the needs of NLP, and find out where things are headed in the future.
- Beyond Neurons: Five Cognitive Functions of the Human Brain that we are Trying to Recreate with Artificial Intelligence
- Sep 3, 2019.
The quest for recreating cognitive capabilities of the brain in deep neural networks remains one of the elusive goals of AI. Let’s explore some human cognitive skills that are serving as inspiration to a new generation of AI techniques.
- Automate your Python Scripts with Task Scheduler: Windows Task Scheduler to Scrape Alternative Data
- Sep 3, 2019.
In this tutorial, you will learn how to use Task Scheduler to scrape data from the Lazada e-commerce website and store it in a SQLite database.
- 6 Tips for Building a Training Data Strategy for Machine Learning
- Sep 2, 2019.
Without a well-defined approach for collecting and structuring training data, launching an AI initiative becomes an uphill battle. These six recommendations will help you craft a successful strategy.
- Cartoon: Labor Day in the age of AI
- Sep 2, 2019.
KDnuggets cartoon looks at how AI will impact Labor Day in the year 2050.
- Top Stories, Aug 26 – Sep 1: Object-oriented programming for data scientists; Why Data Visualization Is The Most Important Skill in a Data Analyst Arsenal
- Sep 2, 2019.
Also: Types of Bias in Machine Learning; Deep Learning Next Step: Transformers and Attention Mechanism; New Poll: Data Science Skills; R Users Salaries from the 2019 Stackoverflow Survey; How to Sell Your Boss on the Need for Data Analytics
- Top 10 Data Science Use Cases in Energy and Utilities
- Sep 2, 2019.
In this article, we will consider the most vivid data science use cases in the industry of energy and utilities.