2018 Jun
- KNIME Fall Summit in Austin, November 6-9, 2018 registrations now open!
- Jun 29, 2018.
KNIME Fall Summit takes place Nov 6-9 in Austin, Texas. Registration is now open, and KDnuggets readers save 10% on top of early bird rates with code KDNUGGETS!
- Modern Graph Query Language – GSQL
- Jun 29, 2018.
This post introduces GSQL as a candidate to fill the need for a modern graph query language.
- Inside the Mind of a Neural Network with Interactive Code in Tensorflow
- Jun 29, 2018.
Understand the inner workings of neural network models as this post covers three related topics: histogram of weights, visualizing the activation of neurons, and interior / integral gradients.
- Building a Basic Keras Neural Network Sequential Model
- Jun 29, 2018.
The approach basically coincides with Chollet's Keras 4 step workflow, which he outlines in his book "Deep Learning with Python," using the MNIST dataset, and the model built is a Sequential network of Dense layers. A building block for additional posts.
- Las Vegas Data Innovation Summits
- Jun 28, 2018.
We're bringing together 200+ leaders from the data & analytics industry for you to network, learn and to discuss the latest trends, topics & opportunities. Use code KD300 to save.
- Announcing Microsoft Research Open Data, a cloud hosted platform for sharing datasets
- Jun 28, 2018.
Microsoft announces Microsoft Research Open Data, datasets representing many years of data curation and research efforts by Microsoft that were published as research outcomes.
- Using Topological Data Analysis to Understand the Behavior of Convolutional Neural Networks
- Jun 28, 2018.
Neural Networks are powerful but complex and opaque tools. Using Topological Data Analysis, we can describe the functioning and learning of a convolutional neural network in a compact and understandable way.
- What’s the Difference Between Data Integration and Data Engineering?
- Jun 28, 2018.
Why is this distinction important? Because it’s critical to understanding how leading organizations are investing in new data engineering skills that exploit advanced analytics to create new sources of business and operational value.
- Choosing Between Modern Data Warehouses
- Jun 28, 2018.
Most modern data warehouse solutions are designed to work with raw data. This lets you re-transform data on the fly without re-ingesting the data already stored in the warehouse.
- Top KDnuggets tweets, Jun 20-26: Detecting Sarcasm with Deep Convolutional Neural Networks
- Jun 27, 2018.
Also: Cartoon: FIFA World Cup Football and Machine Learning; What is it like to be a #MachineLearning engineer in 2018?; The 5 Clustering Algorithms Data Scientists Need to Know.
- [ebook] Apache Spark™ Under the Hood
- Jun 27, 2018.
Learn how to install and run Spark yourself; a summary of Spark's core architecture and concepts; Spark's powerful language APIs and how you can use them.
- Analyzing Personalization Results
- Jun 27, 2018.
The 4th part of this series will help answer the following questions: “Should I improve something or make changes to the system? Can it work more effectively? Can I squeeze the lion’s share of it?”
- Top 20 Python Libraries for Data Science in 2018
- Jun 27, 2018.
Our selection actually contains more than 20 libraries, as some of them are alternatives to each other and solve the same problem. Therefore we have grouped them as it's difficult to distinguish one particular leader at the moment.
- Explaining Reinforcement Learning: Active vs Passive
- Jun 26, 2018.
We examine the required elements to solve an RL problem, compare passive and active reinforcement learning, and review common active and passive RL techniques.
- Keynote announced for Predictive Analytics World for Government – 18-19 Sept in Washington, DC
- Jun 26, 2018.
Join hundreds of your peers at PAW Government, 18-19 Sep in Washington, DC, and learn how government agencies are using predictive analytics and AI to optimize operations and reduce costs.
- 5 Data Science Projects That Will Get You Hired in 2018
- Jun 26, 2018.
A portfolio of real-world projects is the best way to break into data science. This article highlights the 5 types of projects that will help land you a job and improve your career.
- Why Data Scientists Love Gaussian
- Jun 26, 2018.
The Gaussian distribution, often identified by its iconic bell-shaped curve and also referred to as the Normal distribution, is popular mainly for three reasons.
- Batch Normalization in Neural Networks
- Jun 26, 2018.
This article explains batch normalization in a simple way. I wrote it based on what I learned from Fast.ai and deeplearning.ai.
- Introducing WSO2 Stream Processor
- Jun 25, 2018.
WSO2 Stream Processor is an open source, lightweight, Streaming SQL based platform that enables you to do running aggregations, to detect patterns, and to generate alerts on data streams in real-time.
- Stagraph – a general purpose R GUI, for data import, wrangling, and visualization
- Jun 25, 2018.
Stagraph is a new simple visual interface for R, which focuses on data import, data wrangling and data visualization.
- How to Execute R and Python in SQL Server with Machine Learning Services
- Jun 25, 2018.
Machine Learning Services in SQL Server eliminates the need for data movement - you can install and run R/Python packages to build Deep Learning and AI applications on data in SQL Server.
- Top Stories, Jun 18-24: Data Lake – the evolution of data processing; Detecting Sarcasm with Deep Convolutional Neural Networks
- Jun 25, 2018.
Also: What is it like to be a machine learning engineer in 2018?; 7 Simple Data Visualizations You Should Know in R; Choosing the Right Metric for Evaluating Machine Learning Models - Part 2; Data Lake - the evolution of data processing
- 30 Free Resources for Machine Learning, Deep Learning, NLP & AI
- Jun 25, 2018.
Check out this collection of 30 ML, DL, NLP & AI resources for beginners, starting from zero and slowly progressing to the point that readers should have an idea of where to go next.
- Why the Data Lake Matters
- Jun 22, 2018.
This post explains why the data lake matters, outlining its complexity and taking a look at its evolution over time.
- 7 Simple Data Visualizations You Should Know in R
- Jun 22, 2018.
This post presents a selection of 7 essential data visualizations, and how to recreate them using a mix of base R functions and a few common packages.
- Simple Tips for PostgreSQL Query Optimization
- Jun 22, 2018.
A single query optimization tip can boost your database performance by 100x. Although we usually advise our customers to use these tips to optimize analytic queries (such as aggregation ones), this post is still very helpful for any other type of query.
- What is it like to be a machine learning engineer in 2018?
- Jun 21, 2018.
A personal account as to why 2018 is going to be a fun year for machine learning engineers.
- An Intuitive Introduction to Gradient Descent
- Jun 21, 2018.
This post provides a good introduction to Gradient Descent, covering the intuition, variants and choosing the learning rate.
- Detecting Sarcasm with Deep Convolutional Neural Networks
- Jun 21, 2018.
Detection of sarcasm is important in other areas such as affective computing and sentiment analysis because such expressions can flip the polarity of a sentence.
- Deep Learning Best Practices – Weight Initialization
- Jun 21, 2018.
In this blog I am going to talk about the issues related to initialization of weight matrices and ways to mitigate them. Before that, let’s just cover some basics and notations that we will be using going forward.
- Top KDnuggets tweets, Jun 6–19: #MachineLearning predicts #WorldCup2018 winner; 10 More Free Must-Read Books for Data Science
- Jun 20, 2018.
Also: Google #AI principles; #Cartoon: FIFA #WorldCup #Football and #MachineLearning; Introduction to Game Theory; Top 20 Recent Research Papers on Machine Learning and Deep Learning
- Technical Content Personalization
- Jun 20, 2018.
Part 3 of this series moves on from segmenting audiences to the technological side of the process.
- The 5 Clustering Algorithms Data Scientists Need to Know
- Jun 20, 2018.
Today, we’re going to look at 5 popular clustering algorithms that data scientists need to know and their pros and cons!
- Get Packt Skill Up Developer Skills Report
- Jun 19, 2018.
Find the top tools for 4 distinct industries, learn what developers in different sectors say is the next big thing, and more. Also get any Packt book or video for just $10.
- 5 Key Takeaways from Strata London 2018
- Jun 19, 2018.
5 highlights and thoughts from my attendance at Strata London 2018.
- Data Science Predicting The Future
- Jun 19, 2018.
In this article we will expand on the knowledge learnt from the last article - The What, Where and How of Data for Data Science - and consider how data science is applied to predict the future.
- Choosing the Right Metric for Evaluating Machine Learning Models — Part 2
- Jun 19, 2018.
This part focuses on commonly used classification metrics and why, depending on context, we should prefer some over others.
- Natural Language Processing Nuggets: Getting Started with NLP
- Jun 19, 2018.
Check out this collection of NLP resources for beginners, starting from zero and slowly progressing to the point that readers should have an idea of where to go next.
- Drexel Online MS in Data Science
- Jun 18, 2018.
With an emphasis on data science and algorithm creation skills, you’ll graduate workplace-ready, with experience in industry-leading technology.
- Every time someone runs a correlation coefficient on two time series, an angel loses their wings
- Jun 18, 2018.
We all know correlation doesn’t equal causality at this point, but when working with time series data, correlation can lead you to come to the wrong conclusion.
- Top Stories, Jun 11-17: Data Lake – the evolution of data processing; Generating Text with RNNs in 4 Lines of Code
- Jun 18, 2018.
Also: 5 Machine Learning Projects You Should Not Overlook, June 2018; Cartoon: FIFA World Cup Football and Machine Learning; The What, Where and How of Data for Data Science; Data Lake – the evolution of data processing
- Step Forward Feature Selection: A Practical Example in Python
- Jun 18, 2018.
When it comes to disciplined approaches to feature selection, wrapper methods are those which marry the feature selection process to the type of model being built, evaluating feature subsets to compare the model performance they yield, and subsequently selecting the best-performing subset.
- Cartoon: FIFA World Cup Football and Machine Learning
- Jun 16, 2018.
In honor of the 2018 FIFA World Cup in Football, we update our classic KDnuggets cartoon: what can players do when their moves are predicted by Machine Learning?
- Train your team in latest in Analytics, Data, Machine Learning, and Strategy
- Jun 15, 2018.
TDWI Anaheim takes place Aug 5-10. Register by Super Early Bird Deadline on Jun 22 to save up to $915 with priority code KD20. Teams save an EXTRA 10%. Register now!
- How should I organize a larger data science team?
- Jun 15, 2018.
A VP of Data Science asks for opinions on how to organize a larger data science team.
- How to spot a beginner Data Scientist
- Jun 15, 2018.
When beginning life as a data scientist, there are some clear signs that give it away...
- IoT on AWS: Machine Learning Models and Dashboards from Sensor Data
- Jun 15, 2018.
I developed my first IoT project using my notebook as an IoT device and AWS IoT as infrastructure, with this "simple" idea: collect CPU Temperature from my Notebook running on Ubuntu, send to Amazon AWS IoT, save data, make it available for Machine Learning models and dashboards.
- Statistics, Causality, and What Claims are Difficult to Swallow: Judea Pearl debates Kevin Gray
- Jun 15, 2018.
While KDnuggets takes no side, we present the informative and respectful back and forth as we believe it has value for our readers. We hope that you agree.
- 3 Key Capabilities for Real-Time Data and Analysis Today, June 28 Webinar
- Jun 14, 2018.
This webinar educates IT and business stakeholders about the three key capabilities for succeeding with real-time analytics today.
- Build an Anomaly Detection Project [Free Guidebook]
- Jun 14, 2018.
Learn how to find value and insight in outliers in the latest anomaly detection guidebook by Dataiku, which includes use cases, and step-by-step guidance (including code samples) to starting an anomaly detection project.
- Data Lake – the evolution of data processing
- Jun 14, 2018.
This post examines the evolution of data processing in data lakes, with a particular focus on the concepts, architecture and technology criteria behind them.
- Taming LSTMs: Variable-sized mini-batches and why PyTorch is good for your health
- Jun 14, 2018.
After reading this, you’ll be back to fantasies of you + PyTorch eloping into the sunset while your Recurrent Networks achieve new accuracies you’ve only read about on Arxiv.
- Generating Text with RNNs in 4 Lines of Code
- Jun 14, 2018.
Want to generate text with little trouble, and without building and tuning a neural network yourself? Let's check out a project which allows you to "easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code."
- Advice For Applying To Data Science Jobs
- Jun 13, 2018.
A comprehensive guide to applying for a job in data science, covering the application, interview and offer stage.
- How To Create Natural Language Semantic Search For Arbitrary Objects With Deep Learning
- Jun 13, 2018.
An end-to-end example of how to build a system that can search objects semantically.
- ebook: A Guide to Data Science at Scale
- Jun 12, 2018.
Read our eBook to learn how easy it is to build and scale ML models with a unified analytics platform, how to collaborate across data teams to uncover insights faster, and more. Free download.
- Which Data Profession Has The Highest Job Satisfaction?
- Jun 12, 2018.
KDnuggets poll compares Machine Learning Engineer, Researcher, Data Scientist and other professions and identifies one with the highest job satisfaction. Job satisfaction usually starts high, but drops significantly after 4 years on the job.
- The What, Where and How of Data for Data Science
- Jun 12, 2018.
Here we will take data science apart and build it back up to a coherent and manageable concept. Bear with us!
- A Better Stats 101
- Jun 12, 2018.
Statistics encourages us to think systemically and recognize that variables normally do not operate in isolation, and that an effect usually has multiple causes. Some call this multivariate thinking. Statistics is particularly useful for uncovering the Why.
- 5 Machine Learning Projects You Should Not Overlook, June 2018
- Jun 12, 2018.
Here is a new installment of 5 more machine learning or machine learning-related projects you may not yet have heard of, but may want to consider checking out!
- Empowering National Grid with Anaconda Enterprise
- Jun 11, 2018.
With Anaconda Enterprise, National Grid was able to implement a more informed and cost-effective system that allowed for greater accuracy in modeling and predicting maintenance needs. Read the case study to learn more.
- Online Data Science Education and Certificates from Statistics.com
- Jun 11, 2018.
Apply online to The Institute for Statistics Education, the pioneer in online data science education. You can begin your online certificate right now - we offer rolling admission and introductory classes every month. Get started today!
- Top May Stories: Python eats away at R: Top Software for Analytics, Data Science, Machine Learning in 2018; Data Science vs Machine Learning vs Data Analytics vs Business Analytics
- Jun 11, 2018.
Also: Boost your data science skills. Learn linear algebra. 10 More Free Must-Read Books for Machine Learning and Data Science.
- Why you need to improve your training data, and how to do it
- Jun 11, 2018.
This article examines the way you need to improve your training data and how it can be accomplished, including speech commands, choosing the right data, picking a model fast and more.
- Top Stories, Jun 4-10: Did Python declare victory over R?; The Keras 4 Step Workflow
- Jun 11, 2018.
Also: Introduction to Game Theory (Part 1); Human Interpretable Machine Learning (Part 1) - The Need and Importance of Model Interpretation; DIY Deep Learning Projects; 10 More Free Must-Read Books for Machine Learning and Data Science
- Packaging and Distributing Your Python Project to PyPI for Installation Using pip
- Jun 11, 2018.
This tutorial explains the steps required to package your Python projects, build them into distribution formats using setuptools, upload them to the Python Package Index (PyPI) repository using twine, and finally install them using Python installers such as pip and conda.
- Special KDnuggets offer for Strata NY
- Jun 8, 2018.
See some of data’s most fascinating people, from data’s most successful companies, talking about data’s most intriguing problems, at Strata Data Conference in New York, Sep 11-13. Save an additional 20% with code KDNU.
- DIY Deep Learning Projects
- Jun 8, 2018.
Inspired by the great work of Akshay Bahadur, this article presents some projects applying Computer Vision and Deep Learning, with implementations and details so you can reproduce them on your computer.
- Drexel New Online MS in Business Analytics
- Jun 7, 2018.
With Drexel University’s online MS in Business Analytics program, you’ll be able to effectively analyze this overlooked data to give your company and yourself a competitive edge.
- How (dis)similar are my train and test data?
- Jun 7, 2018.
This article examines a scenario where your machine learning model can fail.
- Netflix Data Science Interview Questions: Acing the AI Interview
- Jun 7, 2018.
Gain some perspective on the Netflix interview process, and on ways to prepare for just such an industry interview.
- Command Line Tricks For Data Scientists
- Jun 7, 2018.
Aspiring to master the command line should be on every developer’s list, especially data scientists. Learning the ins and outs of your terminal will undeniably make you more productive.
- Top KDnuggets tweets, May 30 – Jun 5: Notes from Coursera #DeepLearning courses by Andrew Ng; Finland offers free online #AI course
- Jun 6, 2018.
Also: Who Is Going To Make Money In #AI?; Generative Adversarial Networks (GANs) in 50 lines of code (PyTorch); Learning from Imbalanced Classes; 10 More Free Must-Read Books for Machine Learning and Data Science
- Best Practices in Data Visualization, continued
- Jun 6, 2018.
Do your data visualizations need a reboot? Though data visualizations may be designed to facilitate understanding, not all graphs are effective. In this webcast, viewers will learn how to use best practices to give a graph a makeover.
- The 6 components of Open-Source Data Science/ Machine Learning Ecosystem; Did Python declare victory over R?
- Jun 6, 2018.
We find 6 tools form the modern open source Data Science / Machine Learning ecosystem; examine whether Python declared victory over R; and review which tools are most associated with Deep Learning and Big Data.
- Learn Business Analytics at Clark University – affordable excellence
- Jun 6, 2018.
Move your career forward in one of the fields with the largest demand. Business Analytics at Clark University will give you the skills employers demand by teaching you how to synthesize data into powerful information.
- The Statistics of Gang Violence
- Jun 6, 2018.
For Carlos Carcach, Professor & Director, Center for Public Policy at the Escuela Superior de Economía y Negocios (ESEN) in Santa Tecla, El Salvador, gangs are an object of intellectual curiosity and the subject of his research.
- Audience Segmentation
- Jun 6, 2018.
Audience segmentation is not just about statistics; it’s about finding your ideal clients and choosing the right way to interact with them.
- Introduction to Game Theory (Part 1)
- Jun 6, 2018.
Check out this game theory basics post for an introduction to Two-player Sequential games — Dominant Strategies, Nash Equilibrium, and Cooperation vs. Defection.
- Human Interpretable Machine Learning (Part 1) — The Need and Importance of Model Interpretation
- Jun 6, 2018.
A brief introduction to machine learning model interpretation.
- NYU Stern MS in Business Analytics
- Jun 5, 2018.
NYU Stern MS in Business Analytics provides experienced professionals with a unique and valuable data-driven business perspective. This 1 year, part-time program is divided into 5 onsite modules with online independent study in between. Apply now.
- Football World Cup 2018 Predictions: Germany vs Brazil in the final, and more
- Jun 5, 2018.
Looking ahead to the FIFA World Cup that kicks off this month (14th June), we have created the official KDnuggets predictions.
- Europe Data Science Conference – KDnuggets Offer – ends Fri, 8 June
- Jun 5, 2018.
Register for ODSC Europe 2018, the leading Data Science conference in Europe. The best rate ends June 8. Use code KDNuggets to save.
- ioModel Machine Learning Research Platform – Open Source
- Jun 5, 2018.
This article introduces ioModel, an open source research platform that ingests data and automatically generates descriptive statistics on that data.
- Three techniques to improve machine learning model performance with imbalanced datasets
- Jun 5, 2018.
The primary objective of this project was to handle the data imbalance issue. In the following subsections, I describe three techniques I used to overcome the data imbalance problem.
- Gain a New Perspective on Your Customer Data at Wharton
- Jun 4, 2018.
Are you using your customer data to its full advantage? Chances are the answer is no. Customer Analytics from Wharton Executive Education, Sep. 17–21, 2018, Philadelphia, gives you a deeper, actionable understanding of your data.
- Big Data Toronto Brings Canada to the Centre Stage in Big Data and AI
- Jun 4, 2018.
The Big Data Toronto conference and expo is back for its 3rd edition on Jun 12-13, 2018 at the Metro Toronto Convention Centre. Big Data focuses on the skills, software and leadership needed to implement data insights, while AI Toronto is dedicated to Toronto’s growing AI and deep learning communities.
- Resources For Women In Data Science and Machine Learning
- Jun 4, 2018.
A comprehensive list of resources for Women in Data Science and Machine Learning, including a list of useful tech groups and published lists for finding Women speakers.
- The Keras 4 Step Workflow
- Jun 4, 2018.
In his book "Deep Learning with Python," Francois Chollet outlines a process for developing neural networks with Keras in 4 steps. Let's take a look at this process with a simple example.
- Top Stories, May 28 – Jun 3: 10 More Free Must-Read Books for Machine Learning and Data Science; A Beginners Guide to the Data Science Pipeline
- Jun 4, 2018.
Also: Descriptive analytics, machine learning, and deep learning viewed via the lens of CRISP-DM; On the contribution of neural networks and word embeddings in NLP; Improving the Performance of a Neural Network; Python eats away at R
- Top 5 Priorities for an Analytics Leader, June 7 Webinar
- Jun 1, 2018.
Learn how an analytics leader enables results with priorities that include evangelizing the importance of data-driven decision-making; aligning analytics with a business value-driven approach; and developing an analytics competency to train and develop staff.
- Upcoming Meetings in AI, Analytics, Big Data, Data Science, Deep Learning, Machine Learning: June 2018 and Beyond
- Jun 1, 2018.
Coming soon: Mega-PAW Las Vegas, Spark + AI Summit SF, CogX London, Big Data Toronto Conference and Expo, ICDM/MLDM NYC, and many more.
- How To Build Intelligent Dashboards Powered by Machine Learning
- Jun 1, 2018.
In this webinar on Jun 5, 1:00 pm ET, analytics industry expert Jen Underwood will demonstrate how to visualize machine learning results with dashboard tools.
- The Future of Artificial Intelligence: Is Your Job Under Threat?
- Jun 1, 2018.
This article examines the rapid growth of artificial intelligence: how we got to this point, the future AI job market and what can be done.
- The Book of Why
- Jun 1, 2018.
Judea Pearl has made noteworthy contributions to artificial intelligence, Bayesian networks, and causal analysis. These achievements notwithstanding, Pearl holds some views many statisticians may find odd or exaggerated.
- Using Linear Regression for Predictive Modeling in R
- Jun 1, 2018.
In this post, we’ll use linear regression to build a model that predicts cherry tree volume from metrics that are much easier for folks who study trees to measure.