Two good presentations
Two good recent presentations:
Pete Skomoroch’s tutorial on “geo analytics”. This is a really good slide deck that manages to introduce [Elastic] MapReduce, Pig, Amazon Mechanical Turk, the Python Natural Language Toolkit, GitHub, and probably other stuff I don’t remember, as components that fit together seamlessly in a single project, in a way that makes complete sense. In fact, I liked this much better than the same presenter’s talk at Strata 2011, maybe because this one goes into more detail.
Similarly, I like Recorded Future developer Anders Karlsson’s presentation about how Recorded Future uses Amazon EC2, because it gives a lot of detail about the nitty-gritty, day-to-day challenges of data analysis and storage. Karlsson also provides some non-obvious thoughts on what works well, and less well, when trying to manage very large data sets on EC2.