2015 Nov Opinions, Interviews, Reports
- Will Balkanization of Data Science lead to one Empire or many Republics? - Nov 30, 2015.
We examine the “Technoslavia” of the Big Data and Data Science market and consider whether it is likely to lead to a unified empire or a federation of independent republics.
- Taming the Elephant: Advice to Director, Big Data Architect - Nov 30, 2015.
Every other day, a new big data software product is released in the market. Which one is the right one to build your product on? Understand how to resolve this conundrum and the role of decision makers.
- 5 Tribes of Machine Learning – Questions and Answers - Nov 27, 2015.
Leading researcher Pedro Domingos answers questions on the 5 tribes of Machine Learning, the Master Algorithm, the No Free Lunch Theorem, Unsupervised Learning, Ensemble methods, the 360-degree recommender, and more.
- Detecting In-App Purchase Fraud with Machine Learning - Nov 25, 2015.
Hacking applications allow users to make in-app purchases for free. With help from a few big games in the GROW data network, we were able to build a model that classifies each purchase as real or fraudulent with a very high level of accuracy.
- OpenText Data Digest Nov 20: The Last Mile of Big Data - Nov 24, 2015.
For this week, we provide some examples of visualizations that crunch their fair share of Big Data on the back end but present it in a way that meets the Last Mile challenge.
- Using Machine Learning To Predict Gender - Nov 24, 2015.
Here is an experiment from the CrowdFlower AI team, in which they used a user’s Twitter account link color, description, and a single random tweet containing the word “and” or “the” to guess who’s behind the curtain.
- The hardest parts of data science - Nov 24, 2015.
The hardest part of data science is not building an accurate model or obtaining good, clean data, but defining feasible problems and coming up with reasonable ways of measuring solutions.
- Bot or Not: an end-to-end data analysis in Python - Nov 23, 2015.
Twitter bots are programs that compose and post tweets without human intervention, and they range widely in complexity. Here we are building a classifier with pandas, NLTK, and scikit-learn to identify Twitter bots.
- H2O World 2015 – Day 3 Highlights - Nov 20, 2015.
Highlights from talks delivered by machine learning experts from Fast Forward Labs, H2O.ai, Kaiser and Macy's at H2O World held in Mountain View.
- Big RAM is eating big data – Size of datasets used for analytics - Nov 20, 2015.
Here we analyse the KDnuggets surveys on the largest datasets used by practitioners to find out how much need there is for Big Data tools over Big RAM.
- What is the importance of Dark Data in Big Data world? - Nov 20, 2015.
Dark data is a subset of big data, but it constitutes the biggest portion of the total volume of big data collected by organizations in a year. We discuss what opportunities this holds for an organization.
- H2O World 2015 – Day 2 Highlights - Nov 19, 2015.
Highlights from talks delivered by machine learning experts from H2O.ai, Jawbone, Stanford, Quora & PayPal at H2O World held in Mountain View.
- On Political Economy and Data Science: When A Discipline Is Not Enough - Nov 18, 2015.
Most non-trivial Data Science applications are interdisciplinary, requiring collaboration across disciplines. We are just beginning to understand the nature of interdisciplinarity in Data Science and the risks of misunderstanding.
- The Data Science Conference 2015 Highlights - Nov 18, 2015.
Here are the highlights from The Data Science Conference 2015, held Nov 12-13 at the University of Chicago: a two-day conference of discussions on Data Science, big data, machine learning, artificial intelligence & predictive modeling, "for professionals" by professionals.
- OpenText Data Digest Nov 13: Making Relevant Data Easy to See - Nov 17, 2015.
For this week, we provide some examples of how complex data can be displayed in an easy-to-understand fashion.
- The different data science roles in the industry - Nov 17, 2015.
Data science roles and responsibilities are diverse, and the skills required for them vary considerably. Here, we describe the different data science roles along with the skill set, technical knowledge and mindset required to carry them out.
- H2O World 2015 – Day 1 Highlights - Nov 16, 2015.
Highlights from talks and tutorials delivered by machine learning experts at H2O World 2015 held in Mountain View.
- The Five Myths of Big Data - Nov 16, 2015.
Here, we bust some of the myths that have been built up around big data, ranging from whether it predicts the future, to whether it is only for big businesses, to whether it is better data.
- TensorFlow Disappoints – Google Deep Learning falls shallow - Nov 16, 2015.
Google recently open-sourced its TensorFlow machine learning library, which aims to bring large-scale, distributed machine learning and deep learning to everyone. But does it deliver?
- Deep Learning, Language Understanding, and the Quest for Human Capacity Cognitive Computing - Nov 16, 2015.
To develop cognitive computing with human-capacity understanding, deep learning research must heed what certain aspects of human symbol processing reveal about the architecture of the human mind.
- Hiring? Approving Mortgages? It’s the Same Thing (Risk …) - Nov 13, 2015.
Traditionally, hiring and approving mortgages are seen as completely different problems. But when you look at them from a data science perspective, the two have similar characteristics.
- A Community Event for Innovative Spark Apps: A Datapalooza Dispatch - Nov 12, 2015.
Datapalooza, which is holding its inaugural event this week in San Francisco, is proving to be a seedbed for innovative apps in the Spark community. James Kobielus describes the highlights.
- Avoiding Tunnel Vision in Peer Comparisons - Nov 12, 2015.
Comparing yourself to peers (benchmarking) lets you understand how you’re doing and identify performance gaps. Benchmarking is widespread but frequently misses useful and actionable insights. The proposed approach helps avoid tunnel vision in benchmarking.
- Customer Study – Dealing with dirty, smelly, horrible data? - Nov 12, 2015.
If you have hands-on experience with data cleaning and data engineering, the Microsoft Data Platform group would love to hear about your challenges. This is for early influence on product development (not sales).
- How to discover stolen data using Hadoop and Big data? - Nov 11, 2015.
We discuss recent data breaches and present an approach that uses Hadoop and data fingerprint matching techniques to discover stolen data.
- What No One Tells You About Real-Time Machine Learning - Nov 9, 2015.
Real-time machine learning has access to a continuous flow of transactional data, but what it really needs in order to be effective is a continuous flow of labeled transactional data, and accurate labeling introduces latency.
- Why Deep Learning Works – Key Insights and Saddle Points - Nov 3, 2015.
A quality discussion on the theoretical motivations for deep learning, including distributed representation, deep architecture, and the easily escapable saddle point.
- How Data Science increased the profitability of the e-commerce industry? - Nov 3, 2015.
Data Science helps businesses gain a richer understanding of their customers by capturing and integrating information on customers' web behaviour, their life events, what led to the purchase of a product or service, how customers interact with different channels, and more.
- 6 crazy things Deep Learning and Topological Data Analysis can do with your data - Nov 2, 2015.
Want to analyze a high dimensional dataset and you are running out of options? Find out how Deep Learning combined with Topological Data Analysis can do exactly that and more.
- 5 Warning Signs that Turn Off Data Science Hiring Managers - Nov 2, 2015.
Here are some warning signs that will prevent managers from hiring you for a Data Science position. If your resume has one or more of them, make an effort to remove the risk factors.