2019 Oct
All (93) | Courses, Education (2) | Meetings (5) | News (4) | Opinions (23) | Top Stories, Tweets (10) | Tutorials, Overviews (48) | Webcasts & Webinars (1)
- How to Make an Agile Team Work for Big Data Analytics
- Oct 31, 2019.
Learn how to approach the challenges of merging an agile methodology into a data science team to bring out the best value for your Big Data products.
- How Data Labeling Facilitates AI Models
- Oct 31, 2019.
AI-based models are highly dependent on accurate, clean, well-labeled, and prepared data in order to produce the desired output and cognition. These models are fed with bulky datasets covering an array of probabilities and computations to make their functioning as smart and gifted as human intelligence.
- How to Build Your Own Logistic Regression Model in Python
- Oct 31, 2019.
A hands-on guide to Logistic Regression for aspiring data scientists and machine learning engineers.
- Top KDnuggets tweets, Oct 23-29: End To End Guide For Machine Learning Project – Explained
- Oct 30, 2019.
Also: Highest paid positions in 2019 are DevOps, Data Scientist, Data Engineer (all over $100K) - Stack Overflow Salary Calculator, Updated; A neural net solves the three-body problem 100 million times faster; The Last SQL Guide for Data Analysis You’ll Ever Need; How YouTube is Recommending Your Next Video
- AutoML for Temporal Relational Data: A New Frontier
- Oct 30, 2019.
While AutoML started out as an automation approach to develop optimal machine learning pipelines, extensions of AutoML to Data Science embedded products can now enable the processing of much more, including temporal relational data.
- Research Guide for Transformers
- Oct 30, 2019.
The problem with RNNs and CNNs is that they aren’t able to keep up with context and content when sentences are too long. This limitation has been solved by paying attention to the word that is currently being operated on. This guide will focus on how this problem can be addressed by Transformers with the help of deep learning.
- 5 Statistical Traps Data Scientists Should Avoid
- Oct 30, 2019.
Here are five statistical fallacies (data traps) which data scientists should be aware of and definitely avoid.
- Why is Machine Learning Deployment Hard?
- Oct 29, 2019.
Developing an excellent machine learning model is one thing. Deploying it to production is another. Consider these lessons learned and recommendations for approaching this important challenge to help ensure value from your AI work.
- About Google’s Self-Proclaimed Quantum Supremacy and its Impact on Artificial Intelligence
- Oct 29, 2019.
Google claimed quantum supremacy, IBM challenged it… but the development is really important for the future of AI.
- How to Extend Scikit-learn and Bring Sanity to Your Machine Learning Workflow
- Oct 29, 2019.
In this post, learn how to extend Scikit-learn code to make your experiments easier to maintain and reproduce.
- Top Stories, Oct 21-27: Everything a Data Scientist Should Know About Data Management; How YouTube is Recommending Your Next Video
- Oct 28, 2019.
Also: Introduction to Natural Language Processing (NLP); Anomaly Detection, A Key Task for AI and Machine Learning, Explained; How to Become a (Good) Data Scientist — Beginner Guide
- Data Sources 101
- Oct 28, 2019.
Data collection is one of the first steps of the data lifecycle — you need to get all the data you require in the first place. To collect the right data, you need to know where to find it and determine the effort involved in collecting it. This article answers the most basic question: where does all the data you need (or might need) come from?
- DataTech20 Seeking Speaker Submissions (16 March 2020, Glasgow)
- Oct 28, 2019.
DataTech is a one-day conference on 16 Mar 2020, at the Technology and Innovation Centre in Glasgow, focusing on key topics in data science, and welcoming members of industry, academia, and the public sector alike. DataTech provides a forum for these different communities to meet, share knowledge and expertise, and forge new collaborations. We are currently welcoming workshop, talk and poster proposals for the DataTech20 conference.
- How Bayes’ Theorem is Applied in Machine Learning
- Oct 28, 2019.
Learn how Bayes’ Theorem is applied in Machine Learning for classification and regression!
- DeepMind is Using This Old Technique to Evaluate Fairness in Machine Learning Models
- Oct 28, 2019.
Visualizing the datasets is an essential component to identify potential sources of bias and unfairness. DeepMind relied on a method called Causal Bayesian networks (CBNs) to represent and estimate unfairness in a dataset.
- 5 Advanced Features of Pandas and How to Use Them
- Oct 25, 2019.
The pandas library offers core functionality when preparing your data using Python. But, many don't go beyond the basics, so learn about these lesser-known advanced methods that will make handling your data easier and cleaner.
- Harnessing Semiotics and Discourse Communities to Understand User Intent
- Oct 25, 2019.
Semiotics helps us understand the importance of context in determining the meaning of a term, and discourse communities provide us with the background context (mental model) by which to interpret its meaning correctly.
- Introduction to Natural Language Processing (NLP)
- Oct 25, 2019.
Have you ever wondered how your personal assistant (e.g., Siri) is built? Do you want to build your own? Perfect! Let’s talk about Natural Language Processing.
- Seven Myths About the True Costs of AI Systems
- Oct 24, 2019.
While there is much excitement today around implementing AI at the enterprise level, the financial costs of this process are often unexpected and underappreciated. These seven myths are crucial lessons learned that executives should know before heading down the road to AI.
- Feature Selection: Beyond feature importance?
- Oct 24, 2019.
In this post, you will see three different techniques for performing Feature Selection on your datasets and how to build an effective predictive model.
- Convolutional Neural Network for Breast Cancer Classification
- Oct 24, 2019.
See how Deep Learning can help in solving one of the most commonly diagnosed cancers in women.
- Top KDnuggets tweets, Oct 16-22: How YouTube is Recommending Your Next Video
- Oct 23, 2019.
Also: The 5 Classification Evaluation Metrics Every Data Scientist Must Know; How to Recognize a Good Data Scientist Job From a Bad One; How to Easily Deploy Machine Learning Models Using Flask.
- Samsung Tech Day: Today’s Electronic Devices Seem Magical, But the Real Super-Power is in Silicon
- Oct 23, 2019.
Samsung’s Tech Day event showcases processor and memory advances for 5G, AI, Cloud and Edge Computing, Automotive, IoT, and more.
- Intro to Adversarial Machine Learning and Generative Adversarial Networks
- Oct 23, 2019.
In this crash course on GANs, we explore where they fit into the pantheon of generative models, how they've changed over time, and what the future has in store for this area of machine learning.
- How to Measure Foot Traffic Using Data Analytics
- Oct 23, 2019.
You need to know how many people visit your store now and what sort of audience you're acquiring. Foot traffic data is going to be invaluable to the success of your business.
- Time Series Analysis: A Simple Example with KNIME and Spark
- Oct 23, 2019.
The task: train and evaluate a simple time series model using a random forest of regression trees and the NYC Yellow taxi dataset.
- Everything a Data Scientist Should Know About Data Management
- Oct 22, 2019.
For full-stack data science mastery, you must understand data management along with all the bells and whistles of machine learning. This high-level overview is a road map for the history and current state of the expansive options for data storage and infrastructure solutions.
- Addressing the Growing Need for Skills in Data Science
- Oct 22, 2019.
To address the current difficulties in hiring data scientists due to their short supply, many companies can benefit from retraining existing analytically minded employees.
- Bye Data Scientists, Hello AI? Not Likely!
- Oct 22, 2019.
AI is becoming more mainstream. The fact that computers/robots will learn after being built and will surpass a human's intelligence is terrifying.
- How to Write Web Apps Using Simple Python for Data Scientists
- Oct 22, 2019.
Convert your Data Science Projects into cool apps easily without knowing any web frameworks.
- Anomaly Detection, A Key Task for AI and Machine Learning, Explained
- Oct 21, 2019.
One way to process data faster and more efficiently is to detect abnormal events, changes or shifts in datasets. Anomaly detection refers to identification of items or events that do not conform to an expected pattern or to other items in a dataset that are usually undetectable by a human expert.
- How YouTube is Recommending Your Next Video
- Oct 21, 2019.
If you are interested in learning more about the latest YouTube recommendation algorithm paper, read this post for details on its approach and improvements.
- Top Stories, Oct 14-20: How to Become a (Good) Data Scientist Beginner Guide
- Oct 21, 2019.
Also: The 5 Classification Evaluation Metrics Every Data Scientist Must Know; Artificial Intelligence: Salaries Heading Skyward; Writing Your First Neural Net in Less Than 30 Lines of Code with Keras; How to select rows and columns in Pandas using [ ], .loc, iloc, .at and .iat; The Last SQL Guide for Data Analysis You'll Ever Need
- This Microsoft Neural Network can Answer Questions About Scenic Images with Minimum Training
- Oct 21, 2019.
Recently, a group of AI experts from Microsoft Research published a paper proposing a method for scene understanding that combines two key tasks: image captioning and visual question answering (VQA).
- How to Get the Most out of ODSC West 2019
- Oct 18, 2019.
ODSC West comes to San Francisco on Oct 29 - Nov 1. With over 300 hours of content, 200+ speakers, and thousands of attendees, there is certainly a lot to see, learn, and do at the conference. Register by Friday for 10% off your pass.
- 5 Tips for Novice Freelance Data Scientists
- Oct 18, 2019.
If you want to launch your data science skills into freelance work, then check out these important tips to help you kick start your next adventure in data.
- Building an intelligent Digital Assistant
- Oct 18, 2019.
In this second part, we want to outline our own experience building an AI application and reflect on why we chose not to utilise deep learning as the core technology.
- Writing Your First Neural Net in Less Than 30 Lines of Code with Keras
- Oct 18, 2019.
Read this quick overview of neural networks and learn how to implement your first in very few lines using Keras.
- Real Data, Big Impact: UChicago Students Work to Improve Sales at Goose Island
- Oct 17, 2019.
Watch UChicago Master of Science in Analytics capstone projects unfold in Real Data, Big Impact and see how students collaborate with their clients to deliver successful analytics projects.
- Data Anonymization – History and Key Ideas
- Oct 17, 2019.
While effective anonymization technology remains elusive, understanding the history of this challenge can guide data science practitioners to address these important concerns through ethical and responsible use of sensitive information.
- Artificial Intelligence: Salaries Heading Skyward
- Oct 17, 2019.
While the average salary for a Software Engineer is around $100,000 to $150,000, to make the big bucks you want to be an AI or Machine Learning Specialist, Scientist, or Engineer.
- How to Easily Deploy Machine Learning Models Using Flask
- Oct 17, 2019.
This post aims to help you get started with putting your trained machine learning models into production using a Flask API.
- Top KDnuggets tweets, Oct 09-15: #DeepLearning for Natural Language Processing (#NLP) using RNNs & CNNs #KDN Post
- Oct 16, 2019.
Also: Kannada-MNIST: A new handwritten digits dataset in ML town; Math for Programmers; The 4 Quadrants of Data Science Skills and 7 Principles for Creating a Viral Data Visualization; The Last SQL Guide for Data Analysis You’ll Ever Need
- How to Become a (Good) Data Scientist – Beginner Guide
- Oct 16, 2019.
A guide covering the things you should learn to become a data scientist, including the basics of business intelligence, statistics, programming, and machine learning.
- Probability Learning: Bayes’ Theorem
- Oct 16, 2019.
Learn about one of the fundamental theorems of probability with an easy everyday example.
- The 5 Classification Evaluation Metrics Every Data Scientist Must Know
- Oct 16, 2019.
This post is about various evaluation metrics and how and when to use them.
- Automated Data Governance 101
- Oct 15, 2019.
The way we control our data isn’t working. Data is as vulnerable as ever. Download this white paper, which outlines lessons about how data science and governance programs can, if implemented properly, reinforce each other’s objectives.
- Using DC/OS to Accelerate Data Science in the Enterprise
- Oct 15, 2019.
Follow this step-by-step tutorial using TensorFlow to set up a DC/OS Data Science Engine as a PaaS for enabling distributed multi-node, multi-GPU model training.
- Top 7 Things I Learned in my Data Science Masters
- Oct 15, 2019.
Even though I’m still in my studies, here’s a list of the most important things I’ve learned (as of yet).
- Research Guide for Video Frame Interpolation with Deep Learning
- Oct 15, 2019.
In this research guide, we’ll look at deep learning papers aimed at synthesizing video frames within an existing video.
- Three Things to Know About Reinforcement Learning
- Oct 14, 2019.
As an engineer, scientist, or researcher, you may want to take advantage of this new and growing technology, but where do you start? The best place to begin is to understand what the concept is, how to implement it, and whether it’s the right approach for a given problem.
- Choosing a Machine Learning Model
- Oct 14, 2019.
Selecting the perfect machine learning model is part art and part science. Learn how to review multiple models and pick the best in both competitive and real-world applications.
- Using Neural Networks to Design Neural Networks: The Definitive Guide to Understand Neural Architecture Search
- Oct 14, 2019.
A recent survey outlined the main neural architecture search methods used to automate the design of deep learning systems.
- Top Stories, Oct 7-13: 10 Free Top Notch Natural Language Processing Courses; The Last SQL Guide for Data Analysis You’ll Ever Need
- Oct 14, 2019.
Also: Activation maps for deep learning models in a few lines of code; The 4 Quadrants of Data Science Skills and 7 Principles for Creating a Viral Data Visualization; OpenAI Tried to Train AI Agents to Play Hide-And-Seek but Instead They Were Shocked by What They Learned; 10 Great Python Resources for Aspiring Data Scientists
- An Overview of Density Estimation
- Oct 14, 2019.
Density estimation is estimating the probability density function of the population from the sample. This post examines and compares a number of approaches to density estimation.
- Upcoming Webinar, Machine Learning Vital Signs: Metrics and Monitoring Models in Production
- Oct 11, 2019.
In this upcoming webinar on Oct 23 @ 10 AM PT, learn why you should invest time in monitoring your machine learning models, the dangers of not paying attention to how a model’s performance can change over time, metrics you should be gathering for each model and what they tell you, and much more.
- Beyond Word Embedding: Key Ideas in Document Embedding
- Oct 11, 2019.
This literature review on document embedding techniques thoroughly covers the many ways practitioners develop rich vector representations of text -- from single sentences to entire books.
- There is No Such Thing as a Free Lunch
- Oct 11, 2019.
You have heard the expression “there is no such thing as a free lunch” – well in machine learning the same principle holds. In fact there is even a theorem with the same name.
- The problem with metrics is a big problem for AI
- Oct 11, 2019.
The practice of optimizing metrics is neither new nor unique to AI, yet AI can be particularly efficient (even too efficient!) at doing so.
- An ODSC West Guide to the Most Important Topics in Data Science Right Now
- Oct 10, 2019.
In this article, we’ll outline just a few of the most important topics in data science that our speakers will be presenting on at ODSC West Oct 29 - Nov 1 in San Francisco.
- 8 Paths to Getting a Machine Learning Job Interview
- Oct 10, 2019.
While you may be focused on your performance during your next job interview, landing that interview can be just as hard. Check out these tips for finding and securing an interview for a machine learning job.
- Lemma, Lemma, Red Pyjama: Or, doing words with AI
- Oct 10, 2019.
If we want a machine learning model to be able to generalize these forms together, we need to map them to a shared representation. But when are two different words the same for our purposes? It depends.
- Activation maps for deep learning models in a few lines of code
- Oct 10, 2019.
We illustrate how to show the activation maps of various layers in a deep CNN model with just a couple of lines of code.
- Top KDnuggets tweets, Oct 02-08: Turn #Python Scripts into Beautiful ML Tools – with Streamlit, an app framework built for #MachineLearning engineers
- Oct 9, 2019.
Also: 12 things I wish I'd known before starting as a Data Scientist; 10 Free Top Notch Natural Language Processing Courses; The Last SQL Guide for Data Analysis; The 4 Quadrants of #DataScience Skills and 7 Principles for Creating a Viral DataViz.
- Math for Programmers
- Oct 9, 2019.
Math for Programmers teaches you the math you need to know for a career in programming, concentrating on what you need to know as a developer.
- Four questions to help accurately scope analytics engineering project
- Oct 9, 2019.
Being really good at scoping analytics projects is crucial for team productivity and profitability. You can consistently deliver on time if you work out the issue first, and these four questions can help you prepare.
- Contributing to PyTorch: By someone who doesn’t know a ton about PyTorch
- Oct 9, 2019.
By the end of my week with the team, I managed to proudly cut two PRs on GitHub. I decided to write a blog post to share knowledge, not just to show that YES, you can too.
- Data Science is Boring (Part 2)
- Oct 9, 2019.
Why I love boring ML problems and how I think about them.
- Top September Stories: I wasn’t getting hired as a Data Scientist. So I sought data on who is.
- Oct 8, 2019.
Also: 10 Great Python Resources for Aspiring Data Scientists; Python Libraries for Interpretable Machine Learning.
- The countdown is on – 2 weeks to Predictive Analytics World London
- Oct 8, 2019.
At Predictive Analytics World London, 16-17 Oct, you'll discover topics tailored for your needs, whether you're an expert practitioner or a newcomer. Use the code KDNUGGETS for a 15% discount on your Predictive Analytics World ticket.
- Why the ‘why way’ is the right way to restoring trust in AI
- Oct 8, 2019.
As so many more organizations now rely on AI to deliver services and consumer experiences, establishing public trust in AI is crucial as these systems begin to make harder decisions that impact customers.
- Introduction to Artificial Neural Networks
- Oct 8, 2019.
In this article, we’ll try to cover everything related to Artificial Neural Networks or ANN.
- Know Your Data: Part 2
- Oct 8, 2019.
To build an effective learning model, you must understand the quality issues that exist in your data and how to detect and deal with them. In general, data quality issues fall into four major categories.
- Math in Our Lives video collection from SIAM
- Oct 7, 2019.
Having trouble explaining why applied math matters to your non-specialist friends and colleagues? As valued members of the applied math community and ambassadors of SIAM, review these short animations and share them with your interested networks! Help us show that math matters and why.
- The 4 Quadrants of Data Science Skills and 7 Principles for Creating a Viral Data Visualization
- Oct 7, 2019.
As a data scientist, your most important skill is creating meaningful visualizations to disseminate knowledge and impact your organization or client. These seven principles will guide you toward developing charts with clarity, as exemplified with data from a recent KDnuggets poll.
- Top Stories, Sep 30 – Oct 6: The Last SQL Guide for Data Analysis You’ll Ever Need; Know Your Data: Part 1
- Oct 7, 2019.
Also: How AI will transform healthcare (and can it fix the US healthcare system?); Choosing the Right Clustering Algorithm for your Dataset; DeepMind Has Quietly Open Sourced Three New Impressive Reinforcement Learning Frameworks; A European Approach to Masters Degrees in Data Science; The Future of Analytics and Data Science
- OpenAI Tried to Train AI Agents to Play Hide-And-Seek but Instead They Were Shocked by What They Learned
- Oct 7, 2019.
OpenAI trained agents in a simple game of hide-and-seek, and the agents learned many other different skills in the process.
- 10 Free Top Notch Natural Language Processing Courses
- Oct 7, 2019.
Are you looking to learn natural language processing? This collection of 10 free top notch courses will allow you to do just that, with something for every approach to learning NLP and its varied topics.
- Overcoming Deep Learning Stumbling Blocks
- Oct 4, 2019.
Find out what was presented at the 6th annual Deep Learning Summit in London, where industry leaders, academics, researchers, and innovative startups presented the latest technological advancements and industry application methods in the field of deep learning.
- The Last SQL Guide for Data Analysis You’ll Ever Need
- Oct 4, 2019.
This is it: the last SQL guide for data analysis you'll ever need! OK, maybe it’s actually the first. But it’ll give you a solid head start.
- Research Guide for Neural Architecture Search
- Oct 4, 2019.
In this guide, we will explore a range of research papers that have sought to solve the challenging task of automating neural network design.
- 6 Must See Deep Learning Experts at ODSC West 2019 – 20% Off Ends Friday
- Oct 3, 2019.
You won’t want to miss the opportunity to learn about the future of deep learning first-hand at ODSC West in San Francisco, Oct 29 - Nov 1. So don’t forget to register soon for 20% off.
- 5 Fundamental AI Principles
- Oct 3, 2019.
While AI may appear magical at times, these five principles will help guide you to avoid pitfalls when leveraging this tech.
- Recreating Imagination: DeepMind Builds Neural Networks that Spontaneously Replay Past Experiences
- Oct 3, 2019.
DeepMind researchers created a model that can replay past experiences in a way that simulates the mechanisms in the hippocampus.
- Training a Machine Learning Engineer
- Oct 3, 2019.
There is no clear outline for how to study Machine Learning/Deep Learning, so many individuals apply every algorithm they have heard of and hope that one of them works for the problem at hand. Below, I've listed some of the steps that one should adopt while solving a machine learning problem.
- Top KDnuggets tweets, Sep 25 – Oct 01: Natural Language in Python using spaCy: An Introduction
- Oct 2, 2019.
Also: Top KDnuggets tweets, Sep 18-24: Python Libraries for Interpretable Machine Learning; Scikit-Learn: A silver bullet for basic ML; Automatic Version Control for Data Scientists; My journey path from a Software Engineer to BI Specialist to a Data Scientist
- Statistical Thinking for Industrial Problem Solving: a free online course
- Oct 2, 2019.
This online course is available – for free – to anyone interested in building practical skills in using data to solve problems better.
- Choosing the Right Clustering Algorithm for your Dataset
- Oct 2, 2019.
Applying a clustering algorithm is much easier than selecting the best one. Each type offers pros and cons that must be considered if you’re striving for a tidy cluster structure.
- Data Preparation for Machine learning 101: Why it’s important and how to do it
- Oct 2, 2019.
As data scientists who are the brains behind the AI-based innovations, you need to understand the significance of data preparation to achieve the desired level of cognitive capability for your models. Let’s begin.
- Multi-Task Learning – ERNIE 2.0: State-of-the-Art NLP Architecture Intuitively Explained
- Oct 2, 2019.
The tech giant Baidu unveiled its state-of-the-art NLP architecture ERNIE 2.0 earlier this year, which scored significantly higher than XLNet and BERT on all tasks in the GLUE benchmark. This major breakthrough in NLP takes advantage of a new innovation called “Continual Incremental Multi-Task Learning”.
- A European Approach to Master’s Degrees in Data Science
- Oct 1, 2019.
Data science education in Europe has been reevaluated and new recommendations are leading the way to the next generation of data science Master's courses to better support and train students.
- Sentiment and Emotion Analysis for Beginners: Types and Challenges
- Oct 1, 2019.
There are three types of emotion AI, and their combinations. In this article, I’ll briefly go through these three types and the challenges of their real-life applications.
- Clustering Metrics Better Than the Elbow Method
- Oct 1, 2019.
We show which metric to use for visualizing and determining an optimal number of clusters, one that works much better than the usual practice, the elbow method.