It gets lonely in the data mines
From the rather interesting BERG blog, a long and thoughtful post on exploring data with code, in a process called material exploration. In this case, the exploration is done in the context of developing a new information system called Ashdown, which is related to the British education system. The author of the blog post argues that in data-exploration projects, code becomes “... a sculpting tool, rather than an engineering material”.
BERG has done material explorations before – they were a big part of our Nokia Personalisation project, for instance – and the value of them is fairly immediate when the materials involved are things you can touch.
But Ashdown is a software project for the web – its substrate is data. What’s the value of a material exploration with an immaterial substrate? What does it look like to perform such explorations?
There are problems and risks in data exploration. One is the exploration vs. exploitation dilemma, familiar from reinforcement learning, among other fields: how should one balance the desire to explore every nook and cranny of the data space against the need to recognize a good-enough strategy and put all effort into fine-tuning it?
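To make the dilemma concrete, here is a minimal sketch of epsilon-greedy action selection on a multi-armed bandit, one common textbook way to trade exploration off against exploitation. It is not taken from the BERG post; the function name, the arm means and the parameter values are all illustrative assumptions.

```python
import random


def epsilon_greedy(true_means, epsilon, steps, seed=0):
    """Play a Gaussian bandit with epsilon-greedy selection.

    true_means, epsilon, steps and seed are hypothetical parameters
    chosen for illustration only.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # how often each arm was pulled
    estimates = [0.0] * n_arms   # running mean reward per arm
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Noisy reward drawn around the arm's (hidden) true mean.
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # Incremental update of the running mean.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return counts, estimates, total_reward


if __name__ == "__main__":
    counts, estimates, total = epsilon_greedy([0.2, 0.5, 0.9], 0.1, 2000)
    print("pulls per arm:", counts)
    print("estimated means:", [round(e, 2) for e in estimates])
```

With epsilon set to 0 the loop never explores and can lock onto whichever arm looked good first; with epsilon set to 1 it never stops exploring. The analogy to data exploration is loose: epsilon plays the role of how much time one keeps spending on new nooks and crannies once a good-enough approach has turned up.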
The author hints at risks related to becoming overwhelmed, even possessed by the data:
[The] dataset – its meaning, its structure – gets stuck in your head, and it’s easy to lose yourself to it. That often makes it harder to explain to others – you start talking in a different language – so it becomes critical to get it out of your head and onto screens.
It also feels lonely in the data-mines at times. Not because you’re the only person working on it, but because no-one else can speak the language you do; the deeper you get into the data, the harder you have to work to communicate it, and the quicker you forget how little anyone else on the project knows.