Are we ready for a true data disaster? An interesting InfoWorld article about the possibility of devastating “data spills” whose effects could be as bad as those of the oil spill, or worse.
Monkey Analytics – a “web based computation tool” that lets users run R, Python and MATLAB commands in the cloud.
Blogs and tweets could predict the future. A New Scientist article that mentions Google’s study from last year, in which they tried to use search data to predict various economic variables. A lot of organizations have seized upon that idea, and lately we have seen examples such as Recorded Future, a company that attempts to “mine the future” using future-related online text sources. Google also famously used the “predictions from search data” idea to predict flu outbreaks. One of the interesting things here, I think, is that people’s searches (which could be viewed naïvely as ways to obtain data) actually become data in themselves: data that can be used as predictors in a statistical model. The Physics of Data is an interesting video where Google’s Marissa Mayer talks about this topic and a lot of other googly stuff (I don’t really get the name of the presentation though, despite her attempt to justify it in the beginning …).
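To make the "searches as predictors" idea a bit more concrete, here is a minimal sketch in Python. It is not Google's method or anyone else's. It assumes a made-up weekly search index and a made-up economic indicator (both synthetic, generated in the code) and fits an ordinary least-squares line so that the search index can be used to predict the indicator. The names `fit_line`, `search_index` and `indicator` are purely illustrative.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for y ≈ intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic data only: a hypothetical search-volume index and an indicator
# that, by construction, depends linearly on it plus some noise.
rng = random.Random(0)
search_index = [rng.gauss(100, 15) for _ in range(200)]
indicator = [5.0 + 0.3 * s + rng.gauss(0, 1.0) for s in search_index]

intercept, slope = fit_line(search_index, indicator)

# Use the fitted line to predict the indicator for a new search-index value.
forecast = intercept + slope * 120.0
print(f"intercept={intercept:.2f}, slope={slope:.3f}, forecast={forecast:.2f}")
```

With real data the relationship would of course be far noisier, and one would normally compare against a baseline model that does not use the search data at all.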
Wikiposit aims to be a “Wikipedia of numerical data.” It aggregates thousands of public data sets (currently 110,000) into a single format and offers a simple API to access them. As of now, it only supports time series data, mostly from the financial domain.