In the final part of this 6 part series on the process of data science, and applying it to a Kaggle competition, building the predictive models is covered, and multiple algorithms are discussed.
This post is a concise overview of a few of the more interesting popular deep learning models to have appeared over the past year. Get up to speed and try a few of the models out for yourself.
HPE Haven OnDemand provides a native API based on cURL calls, as well as numerous language-specific APIs, providing maximum flexibility for developers. This cheat sheet will cover the native and Python text extraction APIs.
In the previous article, we looked at some of the ways to compare different numerical variables. In this article, we shall look at techniques to compare categorical variables with the help of an example.
We are always told that apples and oranges can’t be compared, they are completely different things. Learn as an analyst, how you deal with such difference and make sense of it on a daily basis.
Part 1 of a 7 part series focusing on mining Twitter data for a variety of use cases. This first post lays the groundwork, and focuses on data collection.
Support Vector Machine kernel selection can be tricky, and is dataset dependent. Here is some advice on how to proceed in the kernel selection process.
This second part of an introduction to linear regression moves past the topics covered in the first to discuss linearity, normality, outliers, and other topics of interest.
Part 4 of this fantastic 6 part series covering the process of data science, and its application to a Kaggle competition, focuses on feature extraction and data transformation.
In the conclusion to this two part tutorial, learn how to leverage HPE Haven OnDemand's Machine Learning APIs to build an audio/video analytics app with minimal time and effort.
An introductory overview of Matplotlib, one of the foundational aspects of Scientific Computing in Python, along with some explanation of the maths involved.
In this first part of a two part tutorial, learn how to leverage HPE Haven OnDemand's Machine Learning APIs to build an audio/video analytics app with minimal time and effort.
Lack of data security can not only result in financial losses, but may also damage the reputation of organizations. Take a look at some of the most important data security best practices that can reduce the risks associated with analyzing a massive amount of data.
There are as many approaches to selecting features as there are statisticians since every statistician and their sibling has a POV or a paper on the subject. This is an overview of some of these approaches.
A set of free resources for learning machine learning, inspired by similar open source degree resources. Find links to books and book-length lecture notes for study.
This introduction to linear regression discusses a simple linear regression model with one predictor variable, and then extends it to the multiple linear regression model with at least two predictors.
This post shares some insight gained through years of building data-powered products, and discusses the capabilities you need to have in place in order to successfully build and maintain data systems and data infrastructure.
Another concise explanation of a machine learning concept by Sebastian Raschka. This time, Sebastian explains the difference between Deep Learning and "regular" machine learning.
This is part three in a fantastic 6 part series covering the process of data science, and the application of the process to a Kaggle competition. In this episode, data cleaning and preparation is covered.
A look at the four characteristics that differentiate data infrastructure development from traditional development, and the key issues to look out for.
It can be easy to get carried away with the deluge of big data and to rely on its abundance to deliver better models. However, use of data without context and objective could prove counterproductive; contextual and objective driven samples from the large volume and variety of data can be effective tools.
An introductory overview of NumPy, one of the foundational aspects of Scientific Computing in Python, along with some explanation of the maths involved.
Analyzing Big Data without paying attention to its characteristics and objective can be detrimental, the fix for which can be correct and effective sampling. Read on to transform your Big Data to Smart Data.