Game Theory Reveals the Future of Deep Learning - Dec 29, 2016.
This post covers the emergence of Game Theoretic concepts in the design of newer deep learning architectures. Deep learning systems need to be adaptive to imperfect knowledge and coordinating systems, 2 areas with which game theory can help.
Architecture, Deep Learning, Optimization
- Laying the Foundation for a Data Team - Dec 28, 2016.
Admittedly, there is a lot more to building a successful data team, and we would be lying if we pretended we have it all figured out. But hopefully focusing on the elements in this post is a good start.
Analytics Team, Data Science Team, Team
A Funny Look at Big Data and Data Science - Dec 27, 2016.
A less than serious look at Big Data and Data Science. If you can laugh at all cartoons, then your Data Science skills are in good shape.
Big Data, Cartoon, Humor, SQL
- The Five Capability Levels of Deep Learning Intelligence - Dec 22, 2016.
Deep learning writer Carlos Perez gives his own classification for deep learning-based AI, which is aimed at practitioners. This classification gives us a sense of where we currently are and where we might be heading.
AI, Deep Learning, Machine Intelligence
- The big data ecosystem for science: Climate Science and Climate Change - Dec 22, 2016.
Climate change is one of the most pressing challenges for human society in the 21st century. We review the Big Data ecosystem for studying climate change.
Big Data, Climate Change, Science, Strata
- Privacy, Security and Ethics in Process Mining - Dec 21, 2016.
Data Privacy, Security and Ethics are hot yet complex topics in the business and data science world. This important article discusses and provides guidelines for privacy, security and ethics, specifically in the context of Process Mining.
Pages: 1 2
Anonymity, Ethics, Privacy, Process Mining, Security
4 Reasons Your Machine Learning Model is Wrong (and How to Fix It) - Dec 21, 2016.
This post presents some common scenarios where a seemingly good machine learning model may still be wrong, along with a discussion of how to evaluate these issues by assessing metrics of bias vs. variance and precision vs. recall.
Bias, Overfitting, Variance
- Data Science Basics: Power Laws and Distributions - Dec 21, 2016.
Power laws and other relationships between observable phenomena may not seem like they are of any interest to data science, at least not to newcomers to the field, but this post provides an overview and suggests how they may be.
Beginners, Data Science, Distribution, Zipf's Law
- Data Sources for Cool Data Science Projects - Dec 20, 2016.
One of the biggest obstacles to successful projects has been getting access to interesting data. Here are some more cool public data sources you can use for your next project.
Data Incubator, Datasets, Elections, Healthcare, Michael Li
Machine Learning & Artificial Intelligence: Main Developments in 2016 and Key Trends in 2017 - Dec 20, 2016.
As 2016 comes to a close and we prepare for a new year, check out the final instalment in our "Main Developments in 2016 and Key Trends in 2017" series, where experts weigh in with their opinions.
2017 Predictions, AI, Artificial Intelligence, Machine Learning, Predictions
- ResNets, HighwayNets, and DenseNets, Oh My! - Dec 19, 2016.
This post walks through the logic behind three recent deep learning architectures: ResNet, HighwayNet, and DenseNet. Each makes it more feasible to successfully train deep networks by overcoming the limitations of traditional network design.
Convolutional Neural Networks, Deep Learning, Neural Networks
- The 5 Basic Types of Data Science Interview Questions - Dec 16, 2016.
Data science interviews are notoriously complex, but most of what they throw at you will fall into one of these categories.
Data Science, Interview Questions, Springboard
- Introduction to Bayesian Inference - Dec 16, 2016.
Bayesian inference is a powerful toolbox for modeling uncertainty, combining researcher understanding of a problem with data, and providing a quantitative measure of how plausible various facts are. This overview from Datascience.com introduces Bayesian probability and inference in an intuitive way, and provides examples in Python to help get you started.
Bayesian, Datascience.com, Inference, Probability
- Top 2016 KDnuggets Stories: Must-Know Data Science Interview Q&A, 10 Algorithms Machine Learning Engineers Need to Know - Dec 15, 2016.
Also 20 Questions to Detect Fake Data Scientists; Software used for Analytics, Data Science, Machine Learning projects; Top Algorithms and Methods Used by Data Scientists
Top stories
- Artificial Intelligence and Life in 2030 - Dec 15, 2016.
Read this engaging overview of a report from the Stanford University 100 year study of Artificial Intelligence, “a long-term investigation of the field of Artificial Intelligence (AI) and its influences on people, their communities, and society.”
AI, Artificial Intelligence, Future
50+ Data Science, Machine Learning Cheat Sheets, updated - Dec 14, 2016.
Get up to speed and have concepts and commands handy in Data Science, Data Mining, and Machine Learning algorithms with these cheat sheets covering R, Python, Django, MySQL, SQL, Hadoop, Apache Spark, Matlab, and Java.
Cheat Sheet, Data Science, Django, Hadoop, Java, Machine Learning, MATLAB, Python, R
- The Costs of Misclassifications - Dec 14, 2016.
The importance of correct classification and the hazards of misclassification are subjective, varying from case to case. Let's see how the cost of misclassification is measured from a monetary perspective.
Accuracy, Classification, Cost Sensitive, Salford Systems
- Data Science Basics: What Types of Patterns Can Be Mined From Data? - Dec 14, 2016.
Why do we mine data? This post is an overview of the types of patterns that can be gleaned from data mining, and some real world examples of said patterns.
Beginners, Classification, Data Science, Frequent Pattern Mining, Outliers, Regression
Data Science, Predictive Analytics Main Developments in 2016 and Key Trends for 2017 - Dec 13, 2016.
Key themes included the polling failures in 2016 US Elections, Deep Learning, IoT, greater focus on value and ROI, and increasing adoption of predictive analytics by the "masses" of industry.
Pages: 1 2
2017 Predictions, Data Science, John Elder, Kirk D. Borne, Predictive Analytics, Tom Davenport
- Data Analytics Models in Quantitative Finance and Risk Management - Dec 13, 2016.
We review how key data science algorithms, such as regression, feature selection, and Monte Carlo, are used in financial instrument pricing and risk management.
Data Analytics, Feature Selection, Finance, Regression, Risk Modeling
- arXiv Paper Spotlight: Why Does Deep and Cheap Learning Work So Well? - Dec 13, 2016.
The recent paper at hand approaches explaining deep learning from a different perspective, that of physics, and discusses the role of "cheap learning" (parameter reduction) and how it relates back to this innovative perspective.
Academics, arXiv, Deep Learning, Machine Learning
- 4 Cognitive Bias Key Points Data Scientists Need to Know - Dec 9, 2016.
Cognitive biases are inherently problematic in a variety of fields, including data science. Is this something that can be mitigated? A solid understanding of cognitive biases is the best weapon, which this overview hopes to help provide.
Bias, Cognitive Bias, Confirmation Bias
- Introduction to K-means Clustering: A Tutorial - Dec 9, 2016.
A beginner introduction to the widely-used K-means clustering algorithm, using a delivery fleet data example in Python.
Clustering, Datascience.com, K-means, Python
- Artificial Neural Networks (ANN) Introduction, Part 2 - Dec 9, 2016.
Matching the performance of a human brain is a difficult feat, but techniques have been developed to improve the performance of neural network algorithms, 3 of which are discussed in this post: Distortion, mini-batch gradient descent, and dropout.
Algobeans, Deep Learning, Neural Networks
- Artificial Neural Networks (ANN) Introduction, Part 1 - Dec 8, 2016.
This intro to ANNs will look at how we can train an algorithm to recognize images of handwritten digits. We will be using the images from the famous MNIST (Modified National Institute of Standards and Technology) database.
Algobeans, Image Recognition, MNIST, Neural Networks
- The Best Metric to Measure Accuracy of Classification Models - Dec 7, 2016.
Measuring the accuracy of a model for a classification problem (categorical output) is complex and time-consuming compared to regression problems (continuous output). Let’s walk through the key testing metrics for a classification problem, with an example.
Pages: 1 2
Accuracy, Classification, CleverTap, Measurement, Metrics, Precision, Unbalanced
- Free ebooks: Machine Learning with Python and Practical Data Analysis - Dec 5, 2016.
Two free ebooks: "Building Machine Learning Systems with Python" and "Practical Data Analysis" will give your skills a boost and make a great start in the New Year.
Data Analysis, Free ebook, Machine Learning, Packt Publishing, Python
Why Deep Learning is Radically Different From Machine Learning - Dec 5, 2016.
There is a lot of confusion these days about Artificial Intelligence (AI), Machine Learning (ML) and Deep Learning (DL), yet the distinction is very clear to practitioners in these fields. Are you able to articulate the difference?
Deep Learning, Machine Learning
- Smart Data Platform – The Future of Big Data Technology - Dec 2, 2016.
Data processing and analytical modelling are major bottlenecks in today’s big data world, because human intelligence is needed to decide the relationships between data, the required data engineering tasks, and the analytical models and their parameters. This article discusses how a Smart Data Platform can help solve such problems.
Big Data, Big Data Analytics, China, Data Processing, Modeling, TalkingData
- Random Forests® in Python - Dec 2, 2016.
Random forest is a highly versatile machine learning method with numerous applications ranging from marketing to healthcare and insurance. This is a post about random forests using Python.
Algorithms, Classification, Ensemble Methods, Python, random forests algorithm, Yhat
The hard thing about deep learning - Dec 1, 2016.
It’s easy to optimize simple neural networks, such as a single-layer perceptron. But as the network becomes deeper, the optimization problem becomes crucial. This article discusses such optimization problems in deep neural networks.
CA, Deep Learning, Neural Networks, NP-hard, Optimization, San Jose, Strata
- Top Reasons Why Big Data, Data Science, Analytics Initiatives Fail - Dec 1, 2016.
We examine the main reasons for failure in Big Data, Data Science, and Analytics projects which include lack of clear mandate, resistance to change, and not asking the right questions, and what can be done to address these problems.
Big Data, Data Science, Failure, Project Fail
- Top 10 Amazon Books in Artificial Intelligence & Machine Learning, 2016 Edition - Nov 30, 2016.
Given the ongoing explosion in interest for all things Data Science, Artificial Intelligence, Machine Learning, etc., we have updated our Amazon top books lists from last year. Here are the 10 most popular titles in the AI & Machine Learning category.
AI, Amazon, Artificial Intelligence, Books, Machine Learning
- 10 Tips to Improve your Data Science Interview - Nov 29, 2016.
Interviewing is a skill. Here are 10 tips and resources to improve your Data Science interviews.
Career, Data Science, Interview Questions, Skills
Machine Learning vs Statistics - Nov 29, 2016.
Machine learning is all about predictions, supervised learning, and unsupervised learning, while statistics is about sample, population, and hypotheses. But are they actually that different?
Machine Learning, Statistics
- Introduction to Machine Learning for Developers - Nov 28, 2016.
Whether you are integrating a recommendation system into your app or building a chat bot, this guide will help you get started in understanding the basics of machine learning.
Pages: 1 2
Beginners, Classification, Clustering, Machine Learning, Pandas, Python, R, scikit-learn, Software Developer
Continuous improvement for IoT through AI / Continuous learning - Nov 25, 2016.
In reality, especially for IoT, an analytics model, once built, will not keep producing results with the same accuracy forever. Data patterns change over time, which makes it essential to learn from new data and improve or recalibrate the models to get correct results. This article explains this process of continuous improvement in analytics for IoT.
AI, Deployment, IoT, Machine Learning, Model Performance, Realtime Analytics
- Deep Learning Research Review: Reinforcement Learning - Nov 25, 2016.
This edition of Deep Learning Research Review explains recent research papers in Reinforcement Learning (RL). If you don't have the time to read the top papers yourself, or need an overview of RL in general, this post has you covered.
Pages: 1 2
Deep Learning, Machine Learning, Reinforcement Learning
- Linear Regression, Least Squares & Matrix Multiplication: A Concise Technical Overview - Nov 24, 2016.
Linear regression is a simple algebraic tool which attempts to find the “best” line fitting 2 or more attributes. Read here to discover the relationship between linear regression, the least squares method, and matrix multiplication.
Algorithms, Linear Regression
- Top 10 Facebook Groups for Big Data, Data Science, and Machine Learning - Nov 23, 2016.
Social media now carries not only friendship connections and “selfies” but also everything from political news to scientific information. Social network members are increasingly eager to learn about big data, data science, and machine learning through groups. We review the ten largest Facebook groups in this area.
Big Data, Data Science, Facebook, Machine Learning
- Cartoon: Thanksgiving, Big Data, and Turkey Data Science. - Nov 23, 2016.
We revisit the KDnuggets Thanksgiving cartoon, which examines the predicament of one group of fowl Data Scientists.
Big Data, Cartoon, Thanksgiving
- Predictive Science vs Data Science - Nov 22, 2016.
Is Predictive Science accurately represented by the term Data Science? As a matter of fact, are any of Data Science's constituent sciences well-represented by the umbrella term? This post discusses a few of these points at a high level.
Algorithms, Applied Statistics, Data Science, Prediction
Top 20 Python Machine Learning Open Source Projects, updated - Nov 21, 2016.
Open source is at the heart of innovation and the rapid evolution of technologies these days. This article presents the Top 20 Python Machine Learning Open Source Projects of 2016, along with some very interesting insights and trends found during the analysis.
GitHub, Machine Learning, Open Source, Python, scikit-learn
- Implementing a CNN for Human Activity Recognition in Tensorflow - Nov 21, 2016.
In this post, we will see how to employ a Convolutional Neural Network (CNN) for human activity recognition (HAR) that learns complex features automatically from the raw accelerometer signal to differentiate between activities of daily life.
Pages: 1 2
Convolutional Neural Networks, Deep Learning, TensorFlow, Time Series Classification
- Data Avengers… Assemble! - Nov 19, 2016.
The Avengers are perfectly capable of defending the Earth from our worst enemies. But are they up to the task of taking care of our data? Read this terribly punny "opinion" piece to find out.
Comic, Data Science, Data Science Team
- Questions To Ask When Moving Machine Learning From Practice to Production - Nov 18, 2016.
An overview of applying machine learning techniques to solve problems in production. This article covers some of the varied questions to ponder when incorporating machine learning into teams and processes.
Data Science, Deep Learning, Deployment, Machine Learning, Production
- Process Mining: Where Data Science and Process Science Meet - Nov 17, 2016.
A data scientist without Process Mining training is ill-equipped to uncover the organization’s real processes, analyze compliance, diagnose bottlenecks and improve processes, so improve your skills with the new version of the free Coursera course "Process Mining: Data Science in Action", which starts on November 28, 2016.
Book, Coursera, Online Education, Process Mining
- Deep Learning Reading Group: Skip-Thought Vectors - Nov 17, 2016.
Skip-thought vectors take inspiration from Word2Vec skip-gram and attempt to extend it to sentences, and are created using an encoder-decoder model. Read on for an overview of the paper.
Deep Learning, Lab41, Natural Language Processing, Neural Networks, word2vec
- Combining Different Methods to Create Advanced Time Series Prediction - Nov 16, 2016.
The results from combining methods for time series prediction have been quite promising. However, the degree of error for long-term predictions is still quite high. Sounds like a challenge, so some new experiments are forthcoming!
ARIMA, Data Science, Machine Learning, Prediction, Time Series
How Bayesian Inference Works - Nov 15, 2016.
Bayesian inference isn’t magic or mystical; the concepts behind it are completely accessible. In brief, Bayesian inference lets you draw stronger conclusions from your data by folding in what you already know about the answer. Read an in-depth overview here.
Pages: 1 2 3
Bayes Rule, Bayes Theorem, Bayesian, Inference, Statistics
Data Science and Big Data, Explained - Nov 14, 2016.
This article is meant to give the non-data scientist a solid overview of the many concepts and terms behind data science and big data. While related terms will be mentioned at a very high level, the reader is encouraged to explore the references and other resources for additional detail.
Beginners, Big Data, Data Science, Explained
- An Intuitive Explanation of Convolutional Neural Networks - Nov 11, 2016.
This article provides an easy-to-understand introduction to what convolutional neural networks are and how they work.
Pages: 1 2 3
Convolutional Neural Networks, Deep Learning, Explanation, Machine Learning, Neural Networks
Top 10 Amazon Books in Data Mining, 2016 Edition - Nov 11, 2016.
Given the ongoing explosion in interest for all things Data Mining, Data Science, Analytics, Big Data, etc., we have updated our Amazon top books lists from last year. Here are the 10 most popular titles in the Data Mining category.
Amazon, Books, Data Mining, Data Science
- A Reference Architecture for Self-Service Analytics - Nov 10, 2016.
The keys to self-service analytics success are organizational. In addition to a governed self-service architecture, companies need to establish governance committees and gateways, create federated organizations with co-located BI developers, and provide continuous education, training, and support. Learn how to do this in this report.
Analytics, Architecture, Self-service
- Parallelism in Machine Learning: GPUs, CUDA, and Practical Applications - Nov 10, 2016.
The lack of parallel processing in machine learning tasks inhibits economy of performance, yet it may very well be worth the trouble. Read on for an introductory overview to GPU-based parallelism, the CUDA framework, and some thoughts on practical implementation.
Pages: 1 2
Algorithms, CUDA, GPU, NVIDIA, Parallelism
- Top KDnuggets tweets, Nov 2-8: 35 #OpenSource tools for Internet of Things; An Introduction to Ensemble Learners - Nov 9, 2016.
21 Must-Know #DataScience Interview Questions with Answers; Big #DataScience: Expectation vs. Reality; The 10 Algorithms #MachineLearning Engineers Need to Know.
Data Science, IoT, Top tweets
- A Quick Introduction to Neural Networks - Nov 9, 2016.
This article provides a beginner-level introduction to the multilayer perceptron and backpropagation.
Pages: 1 2 3
Backpropagation, Deep Learning, Machine Learning, Neural Networks
- How to Rank 10% in Your First Kaggle Competition - Nov 9, 2016.
This post presents a pathway to achieving success in Kaggle competitions as a beginner. The path generalizes beyond competitions, however. Read on for insight into succeeding while approaching any data science project.
Pages: 1 2 3 4
Beginners, Competition, Data Science, Kaggle, Machine Learning, Python
Trump, Failure of Prediction, and Lessons for Data Scientists - Nov 9, 2016.
Donald Trump's shocking and unexpected win of the presidency of the United States has once again shown the limits of Data Science and prediction when dealing with human behavior.
Donald Trump, Elections, Failure, Hillary Clinton, Nate Silver, Poll
- Deep Learning cleans podcast episodes from ‘ahem’ sounds - Nov 8, 2016.
“3.5 mm audio jack… Ahem!!” Where did you hear that? ;) Well, this post is not about Google Pixel vs iPhone 7, but about how to remove the ugly “Ahem” sound from speech using a deep convolutional neural network. I must say, a very interesting read.
Convolutional Neural Networks, Deep Learning, Deep Neural Network, Neural Networks, Podcast, Speech
- Practical Data Science: Building Minimum Viable Models - Nov 8, 2016.
Data science for data-driven startups: the Minimum Viable Model (MVM), a new concept for avoiding a full-scale, 95% accurate data science model. Want to know more about MVM? Have a look at this interesting article.
Big Data, Data Science, Startups
- Data Science Basics: An Introduction to Ensemble Learners - Nov 8, 2016.
New to classifiers and a bit uncertain of what ensemble learners are, or how different ones work? This post examines 3 of the most popular ensemble methods in an approach designed for newcomers.
Beginners, Boosting, Data Science, Ensemble Methods
- Largest Dataset Analyzed Poll shows surprising stability, more junior Data Scientists - Nov 8, 2016.
The majority (57%) of respondents only worked with Gigabyte range data. More junior Data Scientists enter the market, but Petabyte Big Data Scientists still stand apart.
Asia, Big Data, Datasets, Europe, Largest, Poll, USA
- Agilience Top Artificial Intelligence, Machine Learning Authorities - Nov 7, 2016.
Agilience developed a new way to find authorities in social media across many fields of interest. In a previous post we reviewed the top authorities in Data Mining and Data Science; in this post we review the top authorities in Artificial Intelligence and Machine Learning, who include Vineet Vashishta, Kirk D. Borne, KDnuggets, James Kobielus, Kaggle and more.
Pages: 1 2
About KDnuggets, Agilience, AI, Artificial Intelligence, Influencers, Kaggle, Kirk D. Borne, Machine Learning
- Top /r/MachineLearning Posts, October: NSFW Image Recognition, Differentiable Neural Computers, Hinton on Coursera - Nov 4, 2016.
NSFW Image Recognition, Differentiable Neural Computers, Hinton's Neural Networks for Machine Learning Coursera course; Introducing the AI Open Network; Making a Self-driving RC Car
DeepMind, Geoff Hinton, Image Recognition, Machine Learning, Neural Networks, Reddit, Self-Driving Car
- An NLP Approach to Analyzing Twitter, Trump, and Profanity - Nov 3, 2016.
Who swears more? Do Twitter users who mention Donald Trump swear more than those who mention Hillary Clinton? Let’s find out by taking a natural language processing approach (or, NLP for short) to analyzing tweets.
Pages: 1 2
Donald Trump, Natural Language Processing, NLP, Twitter
- Artificial Intelligence Classification Matrix - Nov 3, 2016.
There might be several different ways to think around machine intelligence startups; too narrow of a framework might be counterproductive given the flexibility of the sector and the facility of transitioning from one group to another. Check out this categorization matrix.
AI, Artificial Intelligence, Classification, Startups
- Decision Tree Classifiers: A Concise Technical Overview - Nov 3, 2016.
The decision tree is one of the oldest and most intuitive classification algorithms in existence. This post provides a straightforward technical overview of this brand of classifiers.
Algorithms, C4.5, CART, Decision Trees
Eight Things an R user Will Find Frustrating When Trying to Learn Python - Nov 2, 2016.
Are you an R user considering learning Python? Here's some insight into what you may be up against, and what, specifically, you may find frustrating. But don't worry, it's not all terrible.
Python, R
- Evaluating HTAP Databases for Machine Learning Applications - Nov 2, 2016.
Businesses are producing a greater number of intelligent applications, which traditional databases are unable to support. A new class of databases, Hybrid Transactional and Analytical Processing (HTAP) databases, offers a variety of capabilities with specific strengths and weaknesses to consider. This article aims to give application developers and data scientists a better understanding of the HTAP database ecosystem so they can make the right choice for their intelligent application.
Pages: 1 2
Big Data, Data Processing, HTAP, Oracle, SAP, Splice Machine, SQL
- Do You Suffer From Analytic Personality Disorder (APD)? - Nov 2, 2016.
Read this lighthearted take on Analytics Personality Disorder, a (nonexistent) syndrome for those obsessed with analytics.
Analytics, Humor, Psychology, Society
- Introduction to Trainspotting: Computer Vision, Caltrain, and Predictive Analytics - Nov 1, 2016.
We previously analyzed delays using Caltrain’s real-time API to improve arrival predictions, and we have modeled the sounds of passing trains to tell them apart. In this post we’ll start looking at the nuts and bolts of making our Caltrain work possible.
Computer Vision, Raspberry Pi, SVDS, TensorFlow
- Data Science 101: How to get good at R - Nov 1, 2016.
Everybody talks about R programming, how to learn, how to be good at it. But in this article, Ari Lamstein tells us his story about why and how he started with R along with how to publish, market and monetise R projects.
Ari Lamstein, Beginners, Data Science, Monetizing, Programming, R
- How Can Lean Six Sigma Help Machine Learning? - Nov 1, 2016.
The data cleansing phase alone is not sufficient to ensure the accuracy of machine learning when noise or bias exists in the input data. Lean Six Sigma variance reduction can improve the accuracy of machine learning results.
Data Cleaning, Machine Learning, Predictive Analytics, Statistics
- Cartoon: Scary Big Data - Oct 29, 2016.
What do Halloween and Big Data have in common? Both can be scary, as the new KDnuggets cartoon shows.
Big Data, Cartoon, Halloween, Healthcare, Privacy
Machine Learning: A Complete and Detailed Overview - Oct 28, 2016.
This is an overview (with links) to a 5-part series on introductory machine learning. The set of tutorials is comprehensive, yet succinct, covering many important topics in the field (and beyond).
Machine Learning
- Using Machine Learning to Detect Malicious URLs - Oct 28, 2016.
This is a write-up of an experiment employing a machine learning model to identify malicious URLs. The author provides a link to the code for you to try yourself.
Cybersecurity, Python, Security
- Automated Machine Learning: An Interview with Randy Olson, TPOT Lead Developer - Oct 28, 2016.
Read an insightful interview with Randy Olson, Senior Data Scientist at University of Pennsylvania Institute for Biomedical Informatics, and lead developer of TPOT, an open source Python tool that intelligently automates the entire machine learning process.
Automated Data Science, Automated Machine Learning, Machine Learning, Python, scikit-learn
- Learn Data Science in 8 (Easy) Steps - Oct 27, 2016.
Want to learn data science? Check out these 8 (easy) steps to set out in the right direction!
Pages: 1 2
Big Data, Data Science, DataCamp, Machine Learning
Big Data Science: Expectation vs. Reality - Oct 27, 2016.
The path to success and happiness for a data science team working on a big data project is not always clear from the beginning. It depends on the maturity of the underlying platform, the team's cross-skills, and the DevOps process around their day-to-day operations.
Big Data, Big Data Engineer, Data Science, Data Science Team, DevOps
- Frequent Pattern Mining and the Apriori Algorithm: A Concise Technical Overview - Oct 27, 2016.
This post provides a technical overview of frequent pattern mining (also known by a variety of other names), along with its most famous implementation, the Apriori algorithm.
Algorithms, Apriori, Association Rules, Frequent Pattern Mining
- What is Academic Torrents and Where is Data Sharing Going? - Oct 26, 2016.
Learn more about Academic Torrents, a platform for researchers to share data consisting of a site where users can search for datasets, and a BitTorrent backbone which makes sharing data scalable and fast.
Datasets, Reproducibility, Research
5 EBooks to Read Before Getting into A Machine Learning Career - Oct 21, 2016.
A carefully curated list of 5 free ebooks to help you better understand the various aspects of machine learning and the skills necessary for a career in the field.
Bayesian, Data Science, Deep Learning, Free ebook, Machine Learning, Reinforcement Learning
- Jupyter Notebook Best Practices for Data Science - Oct 20, 2016.
Check out this overview of Jupyter notebook best practices as they pertain to data science. Novice or expert, you may find something of use here.
Data Science, Jupyter, Python, SVDS
A Beginner’s Guide to Neural Networks with Python and SciKit Learn 0.18! - Oct 20, 2016.
This post outlines setting up a neural network in Python using Scikit-learn, the latest version of which now has built-in support for Neural Network models.
Pages: 1 2
Beginners, Machine Learning, Neural Networks, Python, scikit-learn
- Clustering Key Terms, Explained - Oct 18, 2016.
Getting started with Data Science or need a refresher? Clustering is among the most used tools of Data Scientists. Check out these 10 Clustering-related terms and their concise definitions.
Clustering, Explained, Feature Selection, K-means, Key Terms
- LinkedIn Knowledge Graph – KDnuggets Interview - Oct 18, 2016.
We interview LinkedIn about their recently published LinkedIn Knowledge Graph which connects their many millions of members, jobs, companies, and more.
Data Scientist, Deepak Agarwal, Knowledge Graph, LinkedIn
- MLDB: The Machine Learning Database - Oct 17, 2016.
MLDB is an open-source database designed for machine learning. Send it commands over a RESTful API to store data, explore it using SQL, then train machine learning models and expose them as APIs.
Classification, Database, Machine Learning, TensorFlow, Transfer Learning
Top 10 Data Science Videos on Youtube - Oct 17, 2016.
Learning and the future are the key topics in the recent Youtube videos on Data Science. The main questions revolve around: “how to become a Data Scientist”, “what is a data scientist”, and “where data science is going”. But why is there so little explanation of data science to the masses?
Pages: 1 2
Data Science, Data Scientist, DJ Patil, Online Education, R, Videolectures, Youtube
Artificial Intelligence, Deep Learning, and Neural Networks, Explained - Oct 14, 2016.
This article is meant to explain the concepts of AI, deep learning, and neural networks at a level that can be understood by most non-practitioners, and can also serve as a reference or review for technical folks.
AI, Artificial Intelligence, Deep Learning, Explained, Neural Networks
- EDISON Data Science Framework to define the Data Science Profession - Oct 14, 2016.
EDISON Data Science Framework provides conceptual, instructional and policy components required to establish the Data Science profession.
Certification, Data Science, Data Science Certificate, Data Science Education, Data Scientist
- The R Graph Gallery Data Visualization Collection - Oct 13, 2016.
Welcome to the R graph gallery, a collection of R graph examples, organized by chart type, searchable by R function, with reproducible code and explanation.
Art, Data Visualization, ggplot2, Graphics, R, Visualization
- Top 12 Interesting Careers to Explore in Big Data - Oct 12, 2016.
From data-driven strategies to decision making, the true worth of Big Data has been realized, and this has opened up amazing career choices. Check out these 12 interesting careers to explore in Big Data.
Analyst, Big Data, Big Data Engineer, Business Analytics, Data Science, Data Scientist, Machine Learning Scientist, Simplilearn, Statistician
- KDnuggets™ News 16:n36, Oct 12: Battle of the Data Science Venn Diagrams; 9 Bizarre and Surprising Insights; ROI in Big Data Analytics - Oct 12, 2016.
Battle of the Data Science Venn Diagrams; Top September Stories in KDnuggets; Open Images Dataset; Still Searching for ROI in Big Data Analytics?
Big Data ROI, Data Science, Ethics, Venn Diagram
- Here’s How IT Departments are Using Big Data - Oct 10, 2016.
The use cases for big data are clear when it comes to areas like marketing, healthcare, and retail, but IT’s use of big data is a little less clear. Recently, however, some IT departments have been finding ways to use big data to improve their individual operations along with those of the entire organization.
Big Data, Business
- Adversarial Validation, Explained - Oct 7, 2016.
This post proposes and outlines adversarial validation, a method for selecting training examples most similar to test examples and using them as a validation set, and provides a practical scenario for its usefulness.
Pages: 1 2
Adversarial, Explained, Training, Validation
- Top /r/MachineLearning Posts, September: Open Images Dataset; Whopping Deep Learning Grant; Advanced ML Courseware - Oct 7, 2016.
Google Research announces the Open Images dataset; Canadian Government Deep Learning Research grant; DeepMind: WaveNet - A Generative Model for Raw Audio; Machine Learning in a Year - From total noob to using it at work; PhD-level machine learning courses; xkcd: Linear Regression
Canada, Courses, Deep Learning, Generative Models, Geoff Hinton, Machine Learning, Reddit, xkcd
Battle of the Data Science Venn Diagrams - Oct 6, 2016.
First came Drew Conway's data science Venn diagram. Then came all the rest. Read this comparative overview of data science Venn diagrams for both the insight into the profession and the humor that comes along for free.
Pages: 1 2
Data Science, Drew Conway, Venn Diagram
Automated Data Science & Machine Learning: An Interview with the Auto-sklearn Team - Oct 4, 2016.
This is an interview with the authors of the recent winning KDnuggets Automated Data Science and Machine Learning blog contest entry, which provided an overview of the Auto-sklearn project. Learn more about the authors, the project, and automated data science.
Automated, Automated Data Science, Automated Machine Learning, Competition, Machine Learning, scikit-learn
- Embedded Analytics: The Future of Business Intelligence - Sep 30, 2016.
An overview of the evolution of Business Intelligence, and some insight into where its future lies: embedded analytics.
Analytics, API, Business Intelligence
- Deep Learning Reading Group: SqueezeNet - Sep 29, 2016.
This paper introduces a small CNN architecture called “SqueezeNet” that achieves AlexNet-level accuracy on ImageNet with 50x fewer parameters.
Compression, Deep Learning, Lab41, Machine Learning, Neural Networks
- Data Science of Sales Calls: The Surprising Words That Signal Trouble or Success - Sep 29, 2016.
While not as profound a problem as uncovering the secrets of the universe, how to conduct a successful sales conversation is an age-old problem, impacting millions of people every day.
Gong.io, Machine Learning, Sales, Speech Recognition
Top Data Scientist Claudia Perlich on Biggest Issues in Data Science - Sep 29, 2016.
Find out what top data scientist Claudia Perlich believes are - and are not - the biggest issues in data science today, and why data scientists spending 80% of their time on data preparation is not a problem.
Claudia Perlich, Data Science
- Data Science Basics: Data Mining vs. Statistics - Sep 28, 2016.
As a beginner, I was confused about the relationship between data mining and statistics. This is my attempt to help straighten out this connection for others who may now be in my old shoes.
Beginners, Data Mining, Statistics
Data Science for Internet of Things (IoT): Ten Differences From Traditional Data Science - Sep 26, 2016.
Connected devices (the Internet of Things) generate more than 2.5 quintillion bytes of data daily. All this data will significantly impact business processes, and Data Science for IoT will take an increasingly central role. Here we outline 10 main differences between Data Science for IoT and traditional Data Science.
Data Science, Deep Learning, IoT, Privacy, Robots
- Comparing Clustering Techniques: A Concise Technical Overview - Sep 26, 2016.
A wide array of clustering techniques are in use today. Given the widespread use of clustering in everyday data mining, this post provides a concise technical overview of 2 such exemplar techniques.
Algorithms, Clustering, K-means, Machine Learning
- Top 16 Active Big Data, Data Science Leaders on LinkedIn - Sep 23, 2016.
Who are the most active Big Data, Data Science Influencers and Leaders on LinkedIn? We analyze the data and bring you the list of key people to follow.
About Gregory Piatetsky, Bernard Marr, Big Data, Big Data Influencers, Carla Gentry, Data Science, DJ Patil, Influencers, LinkedIn, Tom Davenport
- Deep Learning Reading Group: Deep Residual Learning for Image Recognition - Sep 22, 2016.
Published in 2015, today's paper offers a new architecture for Convolutional Networks, one which has since become a staple in neural network implementations. Read all about it here.
Academics, Convolutional Neural Networks, Deep Learning, Image Recognition, Lab41, Machine Learning, Neural Networks
- Data Science Basics: 3 Insights for Beginners - Sep 22, 2016.
For data science beginners, 3 elementary issues are given overview treatment: supervised vs. unsupervised learning, decision tree pruning, and training vs. testing datasets.
Algorithms, Beginners, Datasets, Overfitting, Supervised Learning, Unsupervised Learning
- Support Vector Machines: A Concise Technical Overview - Sep 21, 2016.
Support Vector Machines remain a popular and time-tested classification algorithm. This post provides a high-level concise technical overview of their functionality.
Algorithms, Machine Learning, Support Vector Machines
9 Key Deep Learning Papers, Explained - Sep 20, 2016.
If you are interested in understanding the current state of deep learning, this post outlines and thoroughly summarizes 9 of the most influential contemporary papers in the field.
Pages: 1 2 3
Academics, Deep Learning, Explained, Neural Networks
- The Great Algorithm Tutorial Roundup - Sep 20, 2016.
This is a collection of tutorials relating to the results of the recent KDnuggets algorithms poll. If you are interested in learning or brushing up on the most used algorithms, as per our readers, look here for suggestions on doing so!
Algorithms, Clustering, Decision Trees, K-nearest neighbors, Machine Learning, PCA, Poll, random forests algorithm, Regression, Statistics, Text Mining, Time Series, Visualization
- Random Forest®: A Criminal Tutorial - Sep 19, 2016.
Get an overview of Random Forest here, one of the most used algorithms by KDnuggets readers according to a recent poll.
Algobeans, CA, Crime, random forests algorithm, San Francisco
- Decision Trees: A Disastrous Tutorial - Sep 15, 2016.
Get a concise overview of decision trees here, one of the algorithms most used by KDnuggets readers, as measured in a recent poll.
Algobeans, Decision Trees, Titanic
- SlangSD: A Sentiment Dictionary for Slang Words - Sep 14, 2016.
The Slang Sentiment Dictionary (SlangSD) includes over 90,000 slang words together with their sentiment scores, facilitating sentiment analysis in user-generated content.
Natural Language Processing, NLP, Sentiment Analysis
Top Algorithms and Methods Used by Data Scientists - Sep 12, 2016.
The latest KDnuggets poll identifies the top algorithms actually used by Data Scientists and finds surprises, including the most academic and most industry-oriented algorithms.
Pages: 1 2
Algorithms, Clustering, Data Visualization, Decision Trees, Poll, Regression
- Urban Sound Classification with Neural Networks in Tensorflow - Sep 12, 2016.
This post discusses techniques for feature extraction from sound in Python using the open-source library Librosa, and implements a neural network in TensorFlow to categorize urban sounds, including car horns, children playing, dogs barking, and more.
Pages: 1 2
Deep Learning, Feature Extraction, Machine Learning, Neural Networks, TensorFlow
- The (Not So) New Data Scientist Venn Diagram - Sep 12, 2016.
This post outlines a (relatively) new(er) Data Science-related Venn diagram, giving an update to Conway's classic, and providing further fuel for flame wars and heated disagreement.
Data Science, Data Scientist, Drew Conway, Venn Diagram, Yanir Seroussi
- Doing the Data Science That Drives Predictive Personalization - Sep 9, 2016.
Agile collaboration within data science teams is essential to the vision of customer analytics and personalization. Attend the IBM DataFirst Launch Event on Sep 27 in New York City to engage with open-source community leaders and practitioners.
Clustering, Customer Analytics, IBM, New York City, NY
- Deep Learning Reading Group: Deep Networks with Stochastic Depth - Sep 8, 2016.
A concise overview of a recent paper that introduces stochastic depth networks, a new way to perturb networks during training in order to improve their performance.
Academics, Deep Learning, Lab41, Neural Networks
- A Beginner’s Guide To Understanding Convolutional Neural Networks Part 2 - Sep 8, 2016.
This is the second part of a thorough introductory treatment of convolutional neural networks. Have a look after reading the first part.
Pages: 1 2
Beginners, Convolutional Neural Networks, Deep Learning, Neural Networks
- Introducing Dask for Parallel Programming: An Interview with Project Lead Developer - Sep 7, 2016.
Introducing Dask, a flexible parallel computing library for analytics. Learn more about this project built with interactive data science in mind in an interview with its lead developer.
Analytics, Continuum Analytics, Dask, Data Science, Distributed Computing, Parallelism, Python, Scientific Computing
- KDnuggets™ News 16:n32, Sep 7: Cartoon: Data Scientist was sexiest job until…; Up to Speed on Deep Learning - Sep 7, 2016.
Cartoon: Data Scientist - the sexiest job of the 21st century until...; Up to Speed on Deep Learning: July Update; How Convolutional Neural Networks Work; Learning from Imbalanced Classes; What is the Role of the Activation Function in a Neural Network?
Balancing Classes, Convolutional Neural Networks, Data Scientist, Deep Learning, Neural Networks
A Beginner’s Guide To Understanding Convolutional Neural Networks Part 1 - Sep 6, 2016.
Interested in better understanding convolutional neural networks? Check out this first part of a very comprehensive overview of the topic.
Pages: 1 2
Beginners, Convolutional Neural Networks, Deep Learning, Neural Networks
- Cartoon: Labor Day in the era of Robotics - Sep 5, 2016.
Amidst all the discussion about robots and automation taking over human jobs, a new KDnuggets cartoon looks at how Labor Day might evolve by 2050.
Automated, Cartoon, Labor Day, Robots, Skills
- The Evolution of IoT Edge Analytics: Strategies of Leading Players - Sep 2, 2016.
This article explores the significance and evolution of IoT edge analytics. Since the author believes that hardware capabilities will converge for large vendors, IoT analytics will be the key differentiator.
Analytics, Cisco, Dell, HPE, IBM, Intel, IoT, PMML
- The Human Vector: Incorporate Speaker Embeddings to Make Your Bot More Powerful - Sep 2, 2016.
One of the many ways in which bots can fail is by their (lack of) persona. Learn how speaker embeddings can help with this problem, and can help improve the persona of your bot.
Bots, Chatbot, Natural Language Processing
- Data Science vs Crime: Detecting Pickpocket Suspects from Transit Records - Sep 1, 2016.
A team of US and Chinese researchers has creatively used massive data collected by automated fare collectors to identify thieves in public transit systems. The system was tested in Beijing and was able to identify 93% of known pickpockets.
Anomaly Detection, Beijing, China, Crime, Hui Xiong, Mobility, Rutgers
- Learning from Imbalanced Classes - Aug 31, 2016.
Imbalanced classes can cause trouble for classification. Not all hope is lost, however. Check out this article for methods of dealing with such a situation.
Pages: 1 2
Balancing Classes, Bayesian, Learning from Data, Sampling, Tom Fawcett
- How Convolutional Neural Networks Work - Aug 31, 2016.
Get an overview of what is going on inside convolutional neural networks, and what it is that makes them so effective.
Pages: 1 2
Brandon Rohrer, Convolutional Neural Networks, Image Recognition, Neural Networks