Follow the Data

A data driven blog

Archive for the tag “social-data”

Data services

There’s been a little hiatus here as I have been traveling. I recently learned that Microsoft has launched Codename “Dallas”, a service for purchasing and managing datasets and web services. It seems they are trying to provide consistent APIs to work with different data from the public and private sectors in a clean way. There’s an introduction here.

This type of online data repository seems to be an idea whose time has arrived – I have previously talked about resources like Infochimps, Datamob and Amazon’s Public Data Sets, and there is also theinfo.org, which I seem to have forgotten to mention. A recent commenter on this blog pointed me to the Comprehensive Knowledge Archive Network (CKAN), which is a “registry of open data and content packages”. Then there are the governmental and municipal data repositories, such as data.gov.

Another interesting service, which may have a slightly different focus, is Factual, described by founder Gil Elbaz as a “platform where anyone can share and mash open data”. Factual basically wants to list facts, and puts the emphasis on data accuracy, so you can express opinions on and discuss the validity of any piece of data. Factual also claims to have “deeper data technology” which allows users to explore the data in a more sophisticated way than other services, such as Amazon’s Public Data Sets, allow.

Companies specializing in helping users make sense of massive data sets are, of course, popping up as well. I have previously written about Good Data, and now the launch of a new, seemingly similar company, Data Applied, has been announced. Like Good Data, Data Applied offers affordable licenses for cloud-based and social data analysis, with a free trial package (though Good Data’s free version seems to offer more: a 10 MB data warehouse and 1-5 users vs. Data Applied’s file size limit of <100 kB for a single user; someone correct me if I am wrong). The visualization capabilities of Data Applied do seem very nice. It’s still unclear to me how different the offerings of these two companies are, but time will tell.


… but does it work?

Upon learning about the possible future of medicine, including self-tracking and social networks for patients, you might wonder whether these things really work or if they are just nice ideas. Well, now there are at least some indications that they are useful.

The Decision Tree quotes a report from the Kaiser Permanente Center for Health Research which says that, in a study about weight loss, people who kept track of how much food they ate lost twice as much weight as people who didn’t. As a press release puts it, “It seems that the simple act of writing down what you eat encourages people to consume fewer calories.” The study was published in the American Journal of Preventive Medicine in August. However, the effect is much more powerful if people exercise and self-track together, in line with ideas about “social contagion” that I have discussed before.

(By the way, the last several posts at The Decision Tree about the Health2.0 conference and the Kaiser Permanente’s HealthCamp “unconference” make for interesting reading about everything that is brewing in this field.)

What about “social medicine”? A couple of weeks ago, Alexandra Carmichael from CureTogether gave a talk at the Mayo Clinic and revealed that the company has achieved its first statistically significant finding based solely on self-reported data from users of their site. According to these “patient-generated data”, people with infertility are twice as likely to have asthma. This has also been found before in controlled clinical studies. It’s a sign that aggregated self-reported data has the potential to uncover many known and unknown correlations.
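To make “twice as likely” and “statistically significant” a bit more concrete, here is a minimal sketch of how such a claim could be checked from a 2×2 table of self-reported conditions. The counts are invented purely for illustration (they are not CureTogether’s data), the function names are my own, and a plain Pearson chi-square test is just one reasonable choice; I don’t know which test the company actually used.

```python
import math

def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Ratio of the condition's frequency in the exposed vs. the unexposed group."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value). With 1 degree of freedom the p-value
    equals erfc(sqrt(statistic / 2)).
    """
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# HYPOTHETICAL counts, made up for this example only.
a, b = 40, 160    # members reporting infertility: 40 also report asthma, 160 do not
c, d = 100, 900   # members not reporting infertility: 100 report asthma, 900 do not

rr = relative_risk(a, a + b, c, c + d)
statistic, p_value = chi_square_2x2(a, b, c, d)
print(f"relative risk = {rr:.2f}, chi-square = {statistic:.2f}, p = {p_value:.2g}")
```

With these made-up numbers the relative risk is 2 and the p-value is far below 0.05. Self-reported data brings its own biases (who signs up, what they choose to report), so a result like this is a hint worth checking against controlled studies rather than a finished finding.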

On the theme of patient data, there is a new piece, Owning Your Health Information – An Inalienable Right by Leslie Saxon, which is worth a read.

And a final reading tip: a Nature opinion piece (with Craig Venter as one of the authors) called An agenda for personalized medicine. It discusses how disease risk assessments obtained from two direct-to-consumer genetic testing companies can vary quite a lot in some cases. The authors suggest a number of “best practices” to improve disease risk predictions.

From social atoms to aggregates

In a recent Perspectives article in Science (subscription required, unfortunately), Alessandro Vespignani lays out a research program that recalls Hari Seldon, the master statistician in Asimov’s Foundation series who has managed to develop a discipline called psychohistory, with which he can predict the future in probabilistic terms.

A huge flow of quantitative data that combine the demographic and behavioral aspects of society with the infrastructural substrate is becoming available […]. Analogously to what happened in physics, we are finally in the position to move from the analysis of the “social atom” or “social molecules” (i.e., small social groups) to the quantitative analysis of social aggregate states, as envisioned by social scientists at the beginning of the past century […]. Here, I refer to “social aggregate states” as large-scale social systems consisting of millions of individuals that can be characterized in space (geographic and social) and time. The shift from the study of a small number of elements to the study of the behavior of large-scale aggregates is equivalent to the shift from atomic and molecular physics to the physics of matter.

Vespignani goes on to briefly discuss reality mining, multiscale modelling and what he calls “network thinking”. He argues that if we succeed in “[…] the gathering of large-scale data on information spread and social reactions that occur during periods of crisis”, “[…] the formulation of formal models that make it possible to quantify the effect of risk perception and awareness phenomena of individuals on the techno-social network structure and dynamics” and “[…] the deployment of monitoring infrastructures capable of informing computational models in real time”, we can

imagine the creation of computational forecasting infrastructures that will help us design better energy-distribution systems, plan for traffic-free cities, anticipate the demands of Internet connectivity, or manage the deployment of resources during health emergencies.

The article is part of a special issue on “Complex systems and networks”, and there is additional interesting material there about econophysics, meta-network analysis, scale-free networks and other topics.

Patient social networks

Online social networks for patients would seem to be rife with potential for medical discovery. Given a critical mass of patients who are communicating with each other and with a (morally responsible) service provider, there should be opportunities for relating disease progression to lifestyle, demographics and so on, for discovering unexpected relationships between different diseases, and perhaps for enabling a more precise categorization of diseases. Of course, patients would also be highly motivated to help in the development of new drugs for their particular disease, although it’s not clear how this motivation should best be leveraged.

CureTogether is an interesting effort in this direction. It describes itself using the term Open Source Health Research and aims to enable patients to learn from their peers and to get personalized health information. CureTogether also allows patients (or anyone, really) to participate in and even fund research directed toward a specific disease.

In addition, CureTogether has released a couple of “crowdsourced” books which I think are pretty interesting. Migraine Heroes contains stories and data from 271 migraine patients who describe, in their own words, symptoms, side effects of treatments and so on. The information collected for the book also suggested novel and surprising co-morbid conditions (secondary diseases or disorders in addition to migraine). Endometriosis Heroes is another book in the same vein.

Patients Like Me is another online social network for patients. It was pretty extensively discussed in an Interviews with Innovators podcast, from which I have borrowed a couple of quotes. Basically, Patients Like Me allows patients to share data about their diseases, the drugs they are taking, side effects from treatments and so on. The company makes money by selling this data (in anonymized form) to drug companies, which hopefully leads to a win-win situation where the drug companies get relevant information and the patients get better drugs. The company “puts patients in direct contact with the companies who can help them”, as Jamie Heywood put it in the podcast linked above.

Patients Like Me provides “structured data for illnesses” – a common format for describing different illnesses in a similar way. This is of course crucial for computer readability and efficient statistical analysis. The company sees itself as “training a group of expert consumers to do evidence based decisions” through their service.

An interesting tidbit from the company’s web site:

Future state modeling – Simply “tracking” a patient’s progression has never been the goal for us; we’ve always wanted to take past information and use it to predict the future state of an individual patient. In relatively linear diseases like ALS, that means we can help patients to plan in advance for when they might need a wheelchair or other equipment. It’s often the case that ALS/MND patients don’t get the equipment they need until several months after they could have benefited from having it. Such a tool would give a customized prediction for the individual patient. After all, most of us don’t want to know about the “average” patient, we want to know about a “patient like me”!
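As a rough illustration of what “future state modeling” could look like in the simplest, roughly linear case, here is a sketch that fits a straight line to a patient’s past scores and estimates when the trend would cross a threshold. Everything here is my own assumption: the scores, the threshold and the function names are hypothetical, and the company’s real models are surely more sophisticated than an ordinary least-squares line.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def month_of_threshold_crossing(months, scores, threshold):
    """Month at which the fitted trend reaches `threshold`.

    Returns None when the fitted trend is flat or rising, because the
    line then never falls to the threshold.
    """
    slope, intercept = fit_line(months, scores)
    if slope >= 0:
        return None
    return (threshold - intercept) / slope

# HYPOTHETICAL self-reported functional scores (higher = better), one every 3 months.
months = [0, 3, 6, 9, 12]
scores = [44, 41, 38, 35, 32]

# HYPOTHETICAL score below which extra equipment might be worth planning for.
threshold = 24

crossing = month_of_threshold_crossing(months, scores, threshold)
print(f"trend reaches {threshold} around month {crossing:.0f}")
```

A personalized version would presumably fit the line to one patient’s own history while borrowing strength from similar patients, which is where a large shared data set would come in.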

Ask a stranger

Ever think about what career you would really be suited for? I know I have. Unfortunately, we humans seem to be really bad at predicting what will make us happy, and according to psychologist Dan Gilbert, when you are faced with a choice, you would be better off asking unknown people who have been in a similar situation instead of listening to your gut instinct.

In the same vein, perhaps we can try asking strangers about our career choices? Path101 is an interesting career site that puts you through a personality test and then suggests careers for you based on the character traits the analysis engine estimates from your answers. In addition, you can get career advice by posting questions, anonymously or openly, to other users. The company calls this community powered career discovery.

Path101 will also analyze your resume and compare it to a database of millions (!) of resumes they have collected. The site delivers lots of interesting statistics about which personality traits tend to be correlated with which jobs, which new careers people in a certain job tend to move on to, and so on. There is a lot more to be found if you poke around the site. The IT Conversations podcast did a nice interview with the founders of the company.

A similar though perhaps more light-hearted service is hunch.com (not to be confused with an excellent machine learning site called hunch.net!), which helps you make choices (big or small) by putting you through a quiz about the choice in question. After completing the quiz, you get a recommendation about how to choose. Apparently, Hunch has some sort of algorithm that learns about you while you use the site, so that you get progressively shorter quizzes before the system recommends a decision.

An interesting thing about Hunch is that it has an API, so you can integrate it into your own applications. I’m not sure what kind of application it would be useful for, but presumably someone will figure it out soon!
