Follow the Data podcast, episode 3: Grokking Big Data with Paco Nathan
In this third episode of the Follow the Data podcast, we talk to Paco Nathan, Data Scientist at Concurrent Inc.
Paco’s blog: http://ceteri.blogspot.se/
The running time is about one hour.
Paco’s internet connection died just as we were about to start the podcast, so he had to connect via Skype on his iPhone. We apologize on behalf of his internet provider in Silicon Valley for the reduced sound quality this caused.
Here are a few links to things we discussed:
An application framework for Java developers to quickly and easily develop robust Data Analytics and Data Management applications on Apache Hadoop.
A dialect of Lisp that runs on the JVM.
A Scala library that makes it easy to write MapReduce jobs in Hadoop.
A simple command line interface for building large-scale data processing jobs based on Cascading.
The CAP theorem states that it is impossible for a distributed computer system to simultaneously provide all three of the following guarantees: consistency, availability, and partition tolerance.
An article on the USB-sized Oxford Nanopore MinION sequencer.
Previously known as Data Without Borders, this organisation aims to do good with Big Data.
Prediction-based insurance for farmers.
An interesting take on how programming culture has affected life. Link to episode #2 (http://vimeo.com/29875053), “The use and abuse of vegetational concepts”, which covers how the idea of ecosystems grew out of the notion of harmony in nature, how this influenced cybernetics, and the perils of taking this animistic concept too far.
A great way to teach kids to code.
Another interesting tool for teaching kids to code and build games.
Free-form virtual reality game.
Some info on an Arduino-based wireless wind measurement project by Karl-Petter Åkesson (in Swedish).
A pioneering internet retailer that Paco co-founded.