Gavagai, which I mentioned in a post about Swedish big data companies earlier this year, seems to have come out of stealth mode. They have launched a blog and started to talk about their Ethersource technology for text analysis. Looking at the use cases in the blog, one gets the impression of a kind of sentiment analysis engine on steroids (although the site also mentions data journalism and author profiling as example applications), but in fact the technology is much more interesting than that. As the Ethersource page describes, the system does not use any pre-existing knowledge but rather attempts to learn concepts from strings of symbols (or, as the page also has it, “computes and tracks relations between terms in symbols in streaming language data”). This also makes the platform more or less language-agnostic and thus suitable for multilingual analysis. Of course (?), the system learns continuously, rather than being “trained” and then updated in discrete jumps.
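To make the idea of tracking relations between terms in a stream a bit more concrete, here is a toy sketch in Python. This is my own illustration and not Ethersource's actual algorithm, which the page does not spell out: I am assuming plain sliding-window co-occurrence counting, and the class name `StreamingCooccurrence` and its `window` parameter are made up for the example. It uses no dictionary or language-specific resources, and it updates its counts with every new piece of text instead of being retrained in batches.

```python
from collections import Counter, defaultdict, deque


class StreamingCooccurrence:
    """Hypothetical toy tracker of term relations in a text stream."""

    def __init__(self, window=2):
        self.window = window                # how many preceding terms count as context
        self.counts = defaultdict(Counter)  # term -> Counter of co-occurring terms

    def update(self, text):
        """Fold one new piece of text into the running counts."""
        recent = deque(maxlen=self.window)  # context is reset for each text
        # Whitespace splitting only: no stemmer, lexicon, or language model.
        for term in text.lower().split():
            for neighbour in recent:
                if neighbour != term:
                    self.counts[term][neighbour] += 1
                    self.counts[neighbour][term] += 1
            recent.append(term)

    def related(self, term, n=3):
        """Return up to n terms seen most often near `term` so far."""
        seen = self.counts.get(term, Counter())
        return [t for t, _ in seen.most_common(n)]


tracker = StreamingCooccurrence(window=2)
tracker.update("the operator dropped my call again")
tracker.update("dropped call after dropped call")
print(tracker.related("dropped"))
```

Because `update` can be called at any time, the notion of what is "related" to a term shifts as new text arrives, which is the property the paragraph above describes. A real system would presumably need something far more compact than raw counts to cope with an unbounded vocabulary.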
Gavagai’s blog contains some case studies in which the system was used to monitor online social media in order to understand customer loyalty towards US mobile network operators, reactions to Rick Perry’s botched interview, the apparently fading interest in Julian Assange, and so on.