-
Must-Know: What are common data quality issues for Big Data and how to handle them?
Let's have a look at common quality issues facing Big Data in terms of the key characteristics of Big Data – Volume, Velocity, Variety, Veracity, and Value.
-
Must-Know: When can parallelism make your algorithms run faster? When could it make your algorithms run slower?
Parallelism is a good idea when the task can be divided into sub-tasks that can be executed independently of each other, without communication or shared resources. Even then, efficient implementation is key to achieving the benefits of parallelization.
-
Must-Know: Why it may be better to have fewer predictors in Machine Learning models?
There are a few reasons why it might be a better idea to have fewer predictor variables rather than having many of them. Read on to find out more.
-
Key Takeaways from Strata + Hadoop World 2017 San Jose, Day 1
The focus is increasingly shifting from storing and processing Big Data efficiently to applying traditional and new machine learning techniques to drive higher value from the data at hand.
-
Businesses Will Need One Million Data Scientists by 2018
A deepening shortage of Data Science talent and cybersecurity challenges are trends shaping business in 2016.
-
Infographic – Data Scientist or Business Analyst? Knowing the Difference is Key
Infographic depicting unique differences between data scientists and business analysts. Find out what type of professional is needed to meet your organization’s needs.
-
Interview: Thanigai Vellore, Art.com on Delivering Contextually Relevant Search Experience
We discuss the role of analytics at Art.com, its polyglot data architecture, the use cases for Hadoop, vendor selection, supporting semantic search, and experience with Avro.
-
Interview: Joseph Babcock, Netflix on Genie, Lipstick, and Other In-house Developed Tools
We discuss the role of analytics in content acquisition, data architecture at Netflix, organizational structure, and open-source tools from Netflix.
-
Interview: Joseph Babcock, Netflix on Discovery and Personalization from Big Data
We discuss the steps involved in the Discovery process at Netflix, the impact of the multitude of devices, system-generated logs, and surprising insights.
-
Top 10 R Packages to be a Kaggle Champion
Kaggle top ranker Xavier Conort shares insights on the “10 R Packages to Win Kaggle Competitions”.