Follow the Data

A data driven blog

Archive for the month “August, 2013”

Healthcare and data

At the Digital Health Days in Stockholm last week, the selected themes were mobile solutions, gamification, big data and quantified self. This meant that the meeting was quite diverse, and that the talks were a bit superficial in order not to lose the audience. Still, I think it was quite nice overall, and I hope for more focused follow-up meetings that explore more defined areas in depth.

In the workshop/panel discussion on “big data in healthcare and life sciences”, we had a good discussion where the grand visions of data-enabled health solutions collided with the miserable realities of healthcare IT in Sweden. Apparently there are around a thousand different data silos in Stockholm for electronic medical records and the like, spread across roughly 1,500 care providers, and these systems cannot communicate with each other. A single hospital, the Karolinska University Hospital, has 273 systems, and a patient might have data in 20 of those systems!

Kairos Future, a consultancy specializing in strategic future analysis (or something like that), recently released a free report called The Data Explosion and the Future of Health. I think it’s well worth a read (although I have some minor quibbles with some of the points made) and more realistic than the McKinsey big data reports, in that it shows more awareness of the difficulties and possible risks.

I also met a representative of a new company for our list of big data companies in Sweden: Malmö- and San Francisco-based Experlytics, which wants to “help revolutionize medicine by making accurate predictions, finding cures to intractable diseases, optimize drug administration and increase the general knowledge in medicine” and “… believe the vision will be achieved by the utilization of data capture and artificial intelligence technologies.”


Spotify music graph meetup

Continuing on the theme of graphs from my last post, I went to Spotify’s Music Graph Tech Talk in Stockholm the other day. It turns out that Spotify has recently started a dedicated group for graph engineering, called the “graph squad” (they are currently hiring for senior graph specialist and graph software engineer roles), which is busy evaluating different options for storing and manipulating the world’s most comprehensive music graph.

It was rather fascinating to hear Jon Åslund describe the various distinctions between “tracks”, “songs” and “recordings”, “albums”, “releases” and “release groups”, and the almost-but-not-quite-perfect ISRC codes for uniquely identifying tracks.
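
To make the track/recording distinction a bit more concrete, here is a minimal Python sketch of my own (not something shown at the talk, and certainly not Spotify's actual data model). It assumes that a “track” is one appearance of a recording on a particular release, and that tracks sharing an ISRC are the same recording. The only thing taken from the standard is the ISRC layout (two-letter country code, three-character registrant code, two-digit year, five-digit designation code); the `Track` class, the helper functions and the example codes are all made up for illustration. Real catalogs are messier, which is presumably what “almost-but-not-quite-perfect” refers to.

```python
import re
from collections import defaultdict
from dataclasses import dataclass

# ISRC layout: 2-letter country code, 3 alphanumeric registrant characters,
# 2-digit year, 5-digit designation code. Hyphens are optional in display form.
ISRC_PATTERN = re.compile(r"^[A-Z]{2}[A-Z0-9]{3}[0-9]{2}[0-9]{5}$")


def normalize_isrc(raw: str) -> str:
    """Return the 12-character form of an ISRC, or raise ValueError."""
    code = raw.replace("-", "").strip().upper()
    if not ISRC_PATTERN.match(code):
        raise ValueError(f"not a valid ISRC: {raw!r}")
    return code


@dataclass(frozen=True)
class Track:
    """Hypothetical model: one appearance of a recording on a release."""
    title: str
    release: str
    isrc: str


def group_by_recording(tracks):
    """Map each normalized ISRC to the list of tracks that carry it."""
    recordings = defaultdict(list)
    for track in tracks:
        recordings[normalize_isrc(track.isrc)].append(track)
    return dict(recordings)


# Made-up example data: one recording appears on an album and a compilation,
# and a live version has its own ISRC.
tracks = [
    Track("Example Song", "Original Album", "SE-ABC-13-00001"),
    Track("Example Song", "Summer Compilation", "seabc1300001"),
    Track("Example Song (Live)", "Live Album", "SE-ABC-13-00002"),
]

recordings = group_by_recording(tracks)
for isrc, appearances in recordings.items():
    print(isrc, [t.release for t in appearances])
```

Under these assumptions, the three tracks collapse into two recordings. Grouping on the ISRC alone only works as long as every recording has exactly one code, so a real music graph would need more than this to decide what counts as the same recording.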

There were two interesting presentations by people from Neo4j. One of them, Graphs for bunnies, is available online and gives an introduction to graph databases.

Anders Arpteg, the squad leader of the graph group, mentioned that the group has existed for a couple of months and is still trying out things like Neo4j and Giraph for handling the music graph. He gave some (“slightly outdated”) numbers that I failed to write down, but I think he said there are something like 20 million active users and that 5 TB of event data is recorded each day. I read from another source that Spotify has the largest commercially used Hadoop cluster in Europe (700 nodes), although I don’t know whether it is used for the graph processing.

All in all, it was a good event which was made even better by free hamburgers and beer.
