Follow the Data

A data driven blog

Games and competitions as research tools

The first high-profile paper describing crowdsourced research results has just been published in Nature. (I am excluding things like Folding@home from consideration here, since in those cases the crowds are donating their processor cycles rather than their brainpower.) The paper describes how the game Foldit (which I blogged about roughly a year ago) was used to refine predicted protein structures. This is an excerpt from the abstract:

Foldit players interact with protein structures using direct manipulation tools and user-friendly versions of algorithms from the Rosetta structure prediction methodology, while they compete and collaborate to optimize the computed energy. We show that top-ranked Foldit players excel at solving challenging structure refinement problems in which substantial backbone rearrangements are necessary to achieve the burial of hydrophobic residues. Players working collaboratively develop a rich assortment of new strategies and algorithms; unlike computational approaches, they explore not only the conformational space but also the space of possible search strategies. The integration of human visual problem-solving and strategy development capabilities with traditional computational algorithms through interactive multiplayer games is a powerful new approach to solving computationally-limited scientific problems.

In other words, Foldit tries to capitalize on intuitive or implicit human problem-solving skills to complement brute-force computational algorithms. Interestingly, all Foldit players are credited as co-authors of the Nature paper, so technically I could count myself as one of them, seeing that I gave the game a try last year. (It’s a lot of fun, actually.)

I think games and competitions (which are almost the same thing, really) will soon be used a lot more than they are today in scientific research (and of course in other areas like productivity, innovation and personal health management, too). The Kaggle blog had an interesting post about competitions as real-time science. In a short time, Kaggle has set up several interesting prediction contests. The Eurovision Song Contest and Football World Cup contests were, I guess, mostly for fun. The interesting thing about the latter, though, was that it was set up as a “Take on the quants” contest, where quantitative analysts from leading banks were pitted against other contestants, and they did terribly. Now the quants have a chance to redeem themselves in the INFORMS challenge, which is about their specialty area: stock price movements …

Anyway … the newest Kaggle contest is very interesting for me as a chess enthusiast. It is an attempt to improve on the age-old (well … I think it was introduced in the late 1960s) Elo rating formula, which is still used in official chess ranking lists. This system was invented by Arpad Elo, a physics professor, based mostly on theoretical considerations, but it has done its job OK. Ideally, Elo ratings should be able to predict the results of games with reasonable accuracy, but whether they really do so has not been very thoroughly analyzed. (As an aside, people have also often tried to use the ratings to compare players from different epochs, which is a futile exercise, but that’s a topic for another post.) The Elo system also has some less well understood properties, like an apparent “rating inflation” (which may or may not be actual inflation). Some years ago, a statistician named Jeff Sonas started to develop his own system, which he claimed could predict the results of future games more accurately.
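For readers who have not seen it, the core of the Elo formula fits in a few lines of Python. This is only a minimal sketch of the textbook version: the function names are my own, and the K-factor of 20 is an assumed illustrative value (federations use different K values for different players), not something taken from the contest.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B.

    A win counts as 1, a draw as 0.5 and a loss as 0, so the result is
    a number between 0 and 1. A 400-point gap corresponds to 10:1 odds.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def updated_rating(rating: float, expected: float, actual: float,
                   k: float = 20.0) -> float:
    """New rating after one game.

    k is the K-factor (assumed value here): it controls how strongly a
    single result moves the rating.
    """
    return rating + k * (actual - expected)


if __name__ == "__main__":
    # A 2700-rated player is expected to score roughly 64% against a 2600.
    e = expected_score(2700, 2600)
    print(f"expected score: {e:.3f}")
    # A draw is below expectation, so the higher-rated player loses points.
    print(f"rating after a draw: {updated_rating(2700, e, 0.5):.1f}")
```

The prediction part is just `expected_score`; the update rule nudges each rating towards whatever would have made the observed result less surprising.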

Now, Sonas (with Kaggle) has taken the next step, which is to arrange a competition to see if this will yield an even better system. The competitors get results of 65,000 recent games by top players and attempt to predict the outcome of a further 7,809 games. At the time of writing, there are already two rating systems that are doing better than Elo (see the leaderboard).
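The general shape of such a contest (fit a rating system on known games, then score its predictions on held-out games) can be sketched roughly as follows. Everything here is an assumption for illustration: the `(white, black, result)` tuple format, the starting rating of 1500, the K-factor of 20, and the use of root-mean-square error as the score. The actual contest data format and evaluation metric may well differ, and this plain Elo baseline ignores things like the advantage of playing White.

```python
import math
from collections import defaultdict


def expected_score(rating_a, rating_b):
    """Elo expected score of A against B (1 = win, 0.5 = draw, 0 = loss)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def train_elo(games, k=20.0, initial=1500.0):
    """Run plain Elo updates over (white, black, result) tuples in order.

    result is White's score: 1.0, 0.5 or 0.0. k and initial are assumed
    illustrative values.
    """
    ratings = defaultdict(lambda: initial)
    for white, black, result in games:
        e_white = expected_score(ratings[white], ratings[black])
        delta = k * (result - e_white)
        ratings[white] += delta
        ratings[black] -= delta  # zero-sum: Black loses what White gains
    return ratings


def rmse(ratings, games):
    """Root-mean-square error of predicted vs. actual scores on held-out games."""
    squared_errors = [
        (result - expected_score(ratings[white], ratings[black])) ** 2
        for white, black, result in games
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    # Tiny made-up example: player A keeps beating player B.
    training_games = [("A", "B", 1.0)] * 10
    held_out_games = [("A", "B", 1.0), ("B", "A", 0.0)]
    ratings = train_elo(training_games)
    print(f"held-out RMSE: {rmse(ratings, held_out_games):.3f}")
```

A competing rating system would replace `train_elo` and `expected_score` with its own logic and be compared on the same held-out score.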

By the way, if you think chess research is not serious enough, Kaggle also has a contest about predicting HIV progression. I’m sure they have other scientific prediction contests lined up. (I’ve noticed a couple of interesting, and lucrative, ones at InnoCentive too.)
