Interview: Haile Owusu, Mashable on Riding the Wave of Viral Content
We discuss Mashable’s milestones, data-driven digital publishing, digital media tracking, viral prediction, and Mashable Velocity.
on Apr 29, 2015 in Content Curation, Haile Owusu, Interview, Mashable, Metrics, Natural Language Processing, Prediction
Data Scientists Thoughts that Inspire
Inspirational thoughts from leading data scientists, including Yann LeCun, Erin Shellman, Daniel Tunkelang, Claudia Perlich, and Jake Porway. What inspires you?
on Apr 29, 2015 in Andy Rey, Claudia Perlich, Daniel Tunkelang, Facebook, Jake Porway, LinkedIn, Yann LeCun
How to become a Data Scientist – brief answer
The most important steps to become a Data Scientist: learn Python, develop a deep understanding of machine learning, and stay up to date. See the post for more details.
on Apr 28, 2015 in Andrew Ng, Data Scientist, Geoff Hinton, Python, Quora
Interview: Mario Vinasco, Facebook on Advancing Marketing Analytics through Rigorous Experimentation
We discuss marketing analytics at Facebook, multi-channel performance assessment, success factors, lessons from Look Back feature, advice, and more.
on Apr 27, 2015 in Apache Hive, Career, Data Science, Experimentation, Facebook, Interview, Mario Vinasco, Marketing Analytics, Predictive Analytics, Trends
The Myth of Model Interpretability
Deep networks are widely regarded as black boxes. But are they truly uninterpretable in any way that logistic regression is not?
on Apr 27, 2015 in Deep Learning, Deep Neural Network, Interpretability, Support Vector Machines, Zachary Lipton
New Hybrid Rare-Event Sampling Technique for Fraud Detection
Proposed hybrid sampling methodology may prove useful when building and validating machine learning models for applications where target event is rare, such as fraud detection.
on Apr 26, 2015 in Bootstrap sampling, Fraud Detection, Sampling
Interview: Emmanuel Letouzé, Data-Pop Alliance on Big Data for Development and Future Prospects
We discuss the field of Big Data for Development, current projects and future plans for Data-Pop Alliance, public participation opportunities, advice, and more.
on Apr 25, 2015 in Advice, Big Data, Comic, Data-Pop Alliance, Emmanuel Letouze, Interview, Trends
Big Data Bootcamp, Austin: Day 3 Highlights
Highlights from the presentations by Big Data and Analytics leaders/consultants on day 3 of Big Data Bootcamp in Austin.
on Apr 24, 2015 in Accenture, Bootcamp, Forrester, Global Big Data Conference, Hadoop, HBase, Hortonworks, Infochimps, NoSQL
MapR on Open Data Platform: Why we declined
Why did MapR decline to participate in the Open Data Platform? Our concerns include redundancy with Apache Software Foundation governance, a misdefined “core”, and lack of participation from Hadoop leaders.
on Apr 24, 2015 in Cloudera, Hadoop, MapR, Open Data Platform
Big Data Bootcamp, Austin: Day 2 Highlights
Highlights from the presentations by Big Data and Analytics leaders/consultants on day 2 of Big Data Bootcamp in Austin.
on Apr 23, 2015 in Bootcamp, Career, Cassandra, Data Analytics, DataStax, Global Big Data Conference, NoSQL
Interview: Emmanuel Letouzé, Data-Pop Alliance on the Role of Big Data in Economic Development
We discuss the emerging Big Data ecosystem, its key players, and the severe consequences of inadequate statistical capabilities across many African nations.
on Apr 23, 2015 in Africa, Data-Pop Alliance, Economics, ecosystem, Emmanuel Letouze, United Nations
Big Data Bootcamp, Austin: Day 1 Highlights
Highlights from the presentations by Big Data and Analytics leaders/consultants on day 1 of Big Data Bootcamp 2015 in Austin.
on Apr 22, 2015 in Apache Spark, Bootcamp, Hadoop, MapReduce, MongoDB, NoSQL, Relational Databases, Sharding, Spark SQL
Interview: Emmanuel Letouzé, Data-Pop Alliance on Big Data and Human Rights – A Complex Affair
We discuss the founding story of Data-Pop Alliance, the applications and implications of Big Data for Human Rights, and the need for wider Data Literacy.
on Apr 22, 2015 in Challenges, Data Literacy, Data-Pop Alliance, Ebola, Emmanuel Letouze, Harvard, Interview, Opportunities
Deep Learning to Fight Crime
We look at how Deep Learning, Spark, and the H2O Machine Learning platform can be used to analyze and predict crime in San Francisco and Chicago.
on Apr 22, 2015 in Apache Spark, CA, Chicago, Crime, Deep Learning, H2O, IL, San Francisco
Interview: Michael Li, Data Incubator on Bridging the Data Science Skills Gap between Academia and Industry
We discuss the response from hiring companies, recommendations for aspirants, retaining data science talent, advice, and more.
on Apr 21, 2015 in Academics, Advice, Career, Data Science Skills, Industry, Interview, Machine Learning, Recommendations, Trends
Interview: Michael Li, Data Incubator on Data-driven Hiring for Data Scientists
We discuss the launch of the Data Incubator, its business model, why we need data-driven hiring, selection process for the incubator program and alumni feedback.
on Apr 20, 2015 in Data Incubator, Data Scientist, Fellowship, Hiring, Incubation, Interview, Michael Li, PhD
PAW San Francisco 5 Min Recap – Predictive Analytics World
PAW San Francisco: 550+ Data Professionals, 85+ conference sessions, 4 conferences, Dean Abbott on 3-legged stool of good data, domain expertise, and advanced analytics, and more.
on Apr 20, 2015 in CA, Dean Abbott, PAW, Predictive Analytics World, San Francisco
Interview: Ksenija Draskovic, Verizon on Conquering Fear and Cherishing Creativity for Success in Data Science
We discuss career advice, motivation, key qualities sought in Data Science practitioners, and more.
on Apr 17, 2015 in Advice, Career, Data Science, Interview, Ksenija Draskovic, Success, Verizon
Data Science 101: Preventing Overfitting in Neural Networks
Overfitting is a major problem for Predictive Analytics and especially for Neural Networks. Here is an overview of key methods to avoid overfitting, including regularization (L2 and L1), Max norm constraints and Dropout.
on Apr 17, 2015 in Neural Networks, Nikhil Buduma, Overfitting, Regularization
Interview: Ksenija Draskovic, Verizon on How to Not Get Lost in the Big Data Wilderness
We discuss recommendations for data-driven decision making, challenges and benefits of using unstructured data, managing expectations and key trends.
on Apr 16, 2015 in Analytics, Interview, Ksenija Draskovic, Success, Trends, Unstructured data, Verizon
Cloud Machine Learning Wars: Amazon vs IBM Watson vs Microsoft Azure
Amazon recently announced Amazon Machine Learning, a cloud machine learning solution for Amazon Web Services. Able to pull data effortlessly from RDS, S3 and Redshift, the product could pose a significant threat to Microsoft Azure ML and IBM Watson Analytics.
on Apr 16, 2015 in Amazon, Azure ML, IBM Watson, Logistic Regression, Machine Learning, MetaMind, Prediction, Regression, Zachary Lipton
Interview: Ksenija Draskovic, Verizon on Dissecting the Anatomy of Predictive Analytics Projects
We discuss Predictive Analytics use cases at Verizon Wireless, advantages of a unified data view, model selection and common causes of failure.
on Apr 15, 2015 in Customer Intelligence, Interview, Ksenija Draskovic, Optimization, Predictive Analytics, Project Fail, Use Cases, Verizon
Interview: Michael Lurye, Time Warner Cable on Key Lessons from Shifting to Hadoop
We discuss the key lessons from shifting to Hadoop, data management in today’s world, future of Data Science, advice and more.
on Apr 14, 2015 in Data Quality, Data Warehousing, Hadoop, Interview, Mike Lurye, Time Warner Cable, Trends
Interview: Michael Lurye, Time Warner Cable on Big Data and the Insatiable Demand for BI
We discuss EDM at Time Warner Cable, data sources, complementing legacy data warehouses with Big Data solutions, vendor selection and build vs. buy decision.
on Apr 13, 2015 in Big Data, Business Intelligence, Data Management, Data Warehouse, Hadoop, Interview, Mike Lurye, Time Warner Cable
Interview: Xia Wang, AstraZeneca on Big Data and the Promise of Effective Healthcare
We discuss challenges in analyzing text data, Big Data impact on translational bioinformatics, advice, desired skills in data scientists, and more.
on Apr 10, 2015 in Advice, AstraZeneca, Bioinformatics, Career, Challenges, Healthcare, Interview, Xia Wang
Algorithmia – How Marketplaces are Fostering Innovation?
We have a marketplace for almost everything – mobile apps, cabs, hotels, and what not. But, not for algorithms. Algorithmia takes up that challenge.
on Apr 9, 2015 in Algorithmia, API, California, Crowdsourcing, Innovation, Marketplace, Social Networks
Interview: Xia Wang, AstraZeneca on Unraveling Patient Treatment Journey by NLP on Clinical Notes
We discuss Analytics at AstraZeneca, prominent use cases, how NLP helped understanding patient treatment journey in diabetes, data sources, insights, and more.
on Apr 9, 2015 in AstraZeneca, Healthcare, Insights, NLP, Recommendations, Research, Xia Wang
Inside Deep Learning: Computer Vision With Convolutional Neural Networks
Deep Learning-powered image recognition is now performing better than human vision on many tasks. We examine how human and computer vision extract features from raw pixels, and explain how deep convolutional neural networks work so well.
on Apr 9, 2015 in Computer Vision, Convolutional Neural Networks, Deep Learning, Image Recognition, Nikhil Buduma
Machine Learning 201: Does Balancing Classes Improve Classifier Performance?
The author investigates if balancing classes improves performance for logistic regression, SVM, and Random Forests, and finds where it helps the performance and where it does not.
on Apr 9, 2015 in Balancing Classes, random forests algorithm, Regression, SVM
Interview: Ravi Iyer, Ranker on Dealing with Inherent Bias in Crowdsourcing Data
We discuss the challenges of analyzing crowdsourcing data, tools and technologies, competitive landscape, advice, trends, and more.
on Apr 8, 2015 in Advice, Bias, Challenges, Crowdsourcing, Interview, Ranker, Ravi Iyer
Predictive Analytics Innovation Summit, San Diego: Day 2 Highlights
Highlights from the presentations by Predictive Analytics leaders from eBay, LinkedIn and Facebook on day 2 of Predictive Analytics Innovation Summit 2015 in San Diego.
on Apr 8, 2015 in A/B Testing, CA, eBay, Facebook, IE Group, LinkedIn, Marketing, Predictive Analytics, San Diego
Interview: Ravi Iyer, Ranker on Why Crowdsourcing Needs Data Science
We discuss the dynamics of Ranker crowdsourcing platform, key factors for effectiveness, role of data science in crowdsourcing, and more.
on Apr 7, 2015 in Analytics, Crowdsourcing, Data Science, Interview, Ranker, Ravi Iyer
Predictive Analytics Innovation Summit, San Diego: Day 1 Highlights
Highlights from the presentations by Predictive Analytics leaders from The Data Incubator, Tamr, Sony and Facebook on day 1 of Predictive Analytics Innovation Summit 2015 in San Diego.
on Apr 7, 2015 in CA, Data Curation, Facebook, IE Group, Marketing, Predictive Analytics, San Diego, Sony, Summit, Tamr
Be Smarter Than Your Devices: Learn About Big Data
If the Apple Watch rollout proves anything, it might be this: Going forward, we’ll all have to be as smart about data as our devices. Also, learn about the origins of the term "Big Data".
on Apr 7, 2015 in Apple Watch, Big Data, IoT, Nitin Indurkhya, Privacy, Tim Cook
Interview: Beth Diaz, Washington Post on How Dark Social is Shadowing Modern Analytics
We discuss recent events at Washington Post, growth initiatives, the growing pain of Dark Social, how to deal with it, audience analytics, advice and more.
on Apr 6, 2015 in Advice, Analytics, Beth Diaz, Challenges, Dark Social, Interview, Jeff Bezos, Washington Post
Big Data Developer Conference, Santa Clara: Day 3 Highlights
Highlights from the presentations/tutorials by Data Science leaders from VISA, Glassbeam, Unravel on day 3 of Big Data Developer Conference, Santa Clara.
on Apr 3, 2015 in Apache Spark, Developers, Global Big Data Conference, Hadoop, Highlights, Security, Spark SQL
Interview: Alessandro Gagliardi, Glassdoor on the Fun and Boring Part of Data Scientist Job
We discuss interesting trends, motivation, different aspects of data scientist job, advice, and more.
on Apr 3, 2015 in Advice, Alessandro Gagliardi, Career, Data Scientist, Glassdoor, Jobs, Trends
Big Data Developer Conference, Santa Clara: Day 2 Highlights
Highlights from the presentations/tutorials by Data Science leaders from Cloudera, LinkedIn, Intel, MapR, Locbit and others on day 2 of Big Data Developer Conference 2015.
on Apr 2, 2015 in Cloudera, Developers, Global Big Data Conference, Highlights, Hortonworks, Intel, LinkedIn, MapR, Security
Hazy Forecast for Consumer Privacy in the Next Decade
A majority of experts felt that developing a privacy framework that would be both popular and functional was next to impossible in the near future. With time, privacy is likely to become a class issue, with consumers who have the money better able to secure their data.
on Apr 2, 2015 in Consumer Analytics, Hal Varian, Privacy
Big Data Developer Conference, Santa Clara: Day 1 Highlights
Highlights from the presentations/tutorials by Data Science leaders from ElephantScale, SciSpike, Twitter and Informatica on day 1 of Big Data Developer Conference, Santa Clara.
on Apr 1, 2015 in Developers, Elephant Scale, Global Big Data Conference, Highlights, Informatica, MongoDB, Parquet, SciSpike, Twitter
Interview: Alessandro Gagliardi, Glassdoor on the Indispensable Skills for Data Scientists
We discuss Analytics at Glassdoor, important lessons, major factors affecting job satisfaction, challenges of working on Twitter Data, indispensable components of Data Science education.
on Apr 1, 2015 in Alessandro Gagliardi, Data Science Skills, Data Scientist, Glassdoor, Interview, Jobs, Prediction, Twitter
Gold Mine or Blind Alley? Functional Programming for Big Data & Machine Learning
Functional programming is touted as a solution for big data problems. Why is it advantageous? Why might it not be? And who is using it now?
on Apr 1, 2015 in Big Data, Functional Programming, Haskell, Zachary Lipton
A Data Scientist Advice to Business Schools
To remain relevant, business school graduates must learn to speak to Data Scientists, whose domain expertise is playing a vital role in an organization's ability to compete in today's market.
on Apr 1, 2015 in Advice, Business Schools, Data Scientist, Sean McClure