followthedata blog report on O'Reilly Strata Data Conference
February 3, 2011, by Mikael Huss
This was the first day of the conference proper, with keynotes and other activities extending throughout the day until 9.20pm. It was a jam-packed day for sure. People were crowding around demonstration booths and bars, and the notice board was full of job openings for data scientists. The big data field seems to be on a roll - at least for now.
The keynotes are already available on O'Reilly's YouTube channel. The ones I enjoyed the most were those by Hilary Mason and Mark Madsen. Hilary Mason came up with two "Strata memes" that people seemed to pick up on: (a) how nice it is to be able to spin up computing clusters while at home in your underwear and (b) that we have enough ad-optimizing algorithms already - it's time to use analytics for stuff that is actually important.
Mark Madsen drew parallels between the current data hype and the gold rush of the 19th century, calling the image of the lone developer with a PC mining terabytes of data a myth. According to him, it's actually changes in business processes that will lead to the really disruptive changes. He also talked about "software eating itself" and how "code is a commodity". Not sure I agree, but it's worth thinking about. Like JC Herz yesterday, Madsen cautioned against pretty visualizations with no use case, urging us not to "become the tabloid journalists of the data industry." He also said that the one paper you definitely have to read if you want to become a useful data scientist is "The Sensemaking Process and Leverage Points for Analyst Technology as Identified Through Cognitive Task Analysis" by Pirolli and Card (2005).