The Big Data ecosystem is just too damn big! It's complex, redundant, and confusing. There are too many layers in the technology stack, too many standards, and too many engines. Vendors? Too many. What is the user to do?
There are many popular machine learning projects out there, and many more that are less well known. Which of these are actively developed and worth checking out? Here is an offering of 5 such projects.
JavaScript may not be the conventional choice for machine learning, but there is no reason it cannot be used for such tasks. Here are the top libraries to facilitate machine learning in JavaScript.
The confluence of data flywheels, the algorithm economy, and cloud-hosted intelligence means every company can now be a data company, every company can now access algorithmic intelligence, and every app can now be an intelligent app.
Data mining is a subfield of computer science which blends many techniques from statistics, data science, database theory and machine learning. Here are the major milestones and “firsts” in the history of data mining plus how it’s evolved and blended with data science and big data.
Check out the details on Andrew Ng's new book on building machine learning systems, and find out how to get your free copy of draft chapters as they are written.
The recent announcement of Microsoft’s acquisition of LinkedIn has raised many questions about how Microsoft will monetize this data. We examine LinkedIn’s value per user and compare it to that of Google, Facebook, Yahoo, and Twitter.
This post shares some results of political text analytics performed on Twitter data. How negative are the US presidential candidates' tweets? How does the media mention the candidates in tweets? Read on to find out!
An open API is available on the internet for free. We review the growth of the API economy and how organizations have been realizing the potential of open APIs in transforming their business.
This article touches upon an important but under-discussed topic of analytics readiness, including whether and when organizations should engage in analytics.
Part 1 of a 7-part series focusing on mining Twitter data for a variety of use cases. This first post lays the groundwork and focuses on data collection.
An interesting discussion of the myriad ways in which startups may choose to acquire data, often the most overlooked yet important aspect of a startup's success (or failure).
Support Vector Machine kernel selection can be tricky and is dataset-dependent. Here is some advice on how to proceed in the kernel selection process.
Where and how can machine learning be practically applied by insurers? And is it worth it? Read the white paper from insurance experts at AIG and Zurich.
Lack of data security can not only result in financial losses, but may also damage the reputation of organizations. Take a look at some of the most important data security best practices that can reduce the risks associated with analyzing a massive amount of data.
Machine learning has permeated data-driven businesses, which means almost all businesses. Here are a few areas where it’s possible that big corporations haven’t already eaten everybody’s lunch.
There are as many approaches to selecting features as there are statisticians, since every statistician and their sibling has a POV or a paper on the subject. This is an overview of some of these approaches.
R remains the leading tool, with 49% share, but Python grows faster and almost catches up to R. RapidMiner remains the most popular general Data Science platform. Big Data tools are used by almost 40%, and Deep Learning usage doubles.
Another concise explanation of a machine learning concept by Sebastian Raschka. This time, Sebastian explains the difference between Deep Learning and "regular" machine learning.