Follow the Data

A data driven blog

Archive for the day “February 27, 2017”

Dynamics in Swedish Twitter communities

TL;DR

I made a community decomposition of Swedish Twitter accounts in 2015 and 2016 and you can explore it in an online app.

Background

As reported on this blog a couple of months ago, (and also here). I have (together with Mattias Östmar) been investigating the community structure of Swedish Twitter users. The analysis we posted then addressed data from 2015 and we basically just wanted to get a handle on what kind of information you can get from this type of analysis.

With the processing pipeline already set up, it was straightforward to repeat the analysis for the fresh data from 2016 as soon as Mattias had finished collecting it. The nice thing about having data from two different years in that we can start to look at the dynamics – namely, how stable communities are, which communities are born or disappear, and how people move between them.

The app

First of all, I made an app for exploring these data. If you are interested in this topic, please help me understand the communities that we have detected by using the “Suggest topic” textbox under the “Community info” tab. That is an attempt to crowdsource the “annotation” of these communities. The suggestions that are submitted are saved in a text file which I will review from time to time and update the community descriptions accordingly.

The fastest climbers

By looking at the data in the app, we can find out some pretty interesting things. For instance, the account that easily increased to most in influence (measured in PageRank) was @BjorklundVictor, who climbed from a rank of 3673 in 2015 in community #4 (which we choose to annotate as an “immigration” community) to a rank of 3 (!) in community #4 in 2016 (this community has also been classified as an immigration-discussion community, and it is the most similar one of all 2016 communities to the 2015 immigration community.) I am not personally familiar with this account, but he must have done something to radically increase his reach in 2016.

Some other people/accounts that increased a lot in influence were professor Agnes Wold (@AgnesWold) who climbed from rank 59 to rank 3 in the biggest community, which we call the “pundit cluster” (it has ID 1 both in 2015 and 2016), @staffanlandin, who went from #189 to #16 in the same community, and @PssiP, who climbed from rank 135 to rank 8 in the defense/prepping community (ID 16 in 2015, ID 9 in 2016).

Some people have jumped to a different community and improved their rank in that way, like @hanifbali, who went from #20 in community 1 (general punditry) in 2015 to the top spot, #1 in the immigration cluster (ID 4) in 2016, and @fleijerstam, who went from #200 in the pundit community in 2015 to #10 in the politics community (#3) in 2016.

Examples of users who lost a lot of ground in their own community are @asaromson (Åsa Romson, the ex-leader of the Green Party; #7 -> #241 in the green community) and @rogsahl (#10 -> #905 in the immigration community).

The most stable communities

It turned out that the most stable communities (i.e. the communities that had the most members in common relative to their total sizes in 2015 and 2016 respectively) were the ones containing accounts using a different language from Swedish, namely the Norwegian, Danish and Finnish communities.

The least stable community

Among the larger communities in 2015, we identified the one that was furthest from having a close equivalent in 2016. This was 2015 community 9, where the most influential account was @thefooomusic. This is a boy band whose popularity arguably hit a peak in 2015. The community closest to it in 2016 is community 24, but when we looked closer at that (which you can also do in the app!), we found that many YouTube stars had “migrated” into 2016 cluster 24 from 2015 cluster 84, which upon inspection turned out to be a very clear Swedish YouTuber cluster with stars such as Clara Henry, William Spetz and Therese Lindgren.

So in other words, the The Fooo fan cluster and the YouTuber cluster from 2015 merged into a mixed cluster in 2016.

New communities

We were hoping to see some completely new communities appear in 2016, but that did not really happen, at least not for the top 100 communities. Granted, there was one that had an extremely low similarity to any 2015 community, but that turned out to be a “community” topped by @SJ_AB, a railway company that replies to a large number of customer queries and complaints on Twitter (which, by the way, makes it the top account of them all in terms of centrality.) Because this company is responding to queries from new people all the time, it’s not really part of a “community” as such, and the composition of the cluster will naturally change a lot from year to year.

Community 24, which was discussed above, was also dissimilar from all the 2015 communitites, but as described, we notice it has absorbed users from 2015 clusters 9 (The Fooo) and 84 (YouTubers).

Movement between the largest communities

The similarity score for the “pundit clusters” (community 1 in 2015 and community 1 in 2016, respectively) somewhat surprisingly showed that these were not very similar overall, although many of the top-ranked users are the same. A quick inspection also showed that the entire top list of community 3 in 2015 moved to community 1 in 2016, which makes the 2015 community 3 the closest equivalent to the 2016 community 1. Both of these communities can be characterized as general political discussion/punditry clusters.

Comparison: The defense/prepper community in 2015 vs 2016

In our previous blog post on this topic, we presented a top-10 list of defense Twitterers and compared that to a manually curated list from Swedish daily Svenska Dagbladet. Here we will present our top-10 list for 2016.

Username Rank in 2016 Rank in 2015 Community ID in 2016 Community ID in 2015
patrikoksanen 1 3 9 16
hallonsa 2 5 9 16
Cornubot 3 1 9 16
waterconflict 4 6 9 16
wisemanswisdoms 5 2 9 16
JohanneH 6 9 9 16
mikaelgrev 7 7 9 16
PssiP 8 135 9 16
oplatsen 9 11 9 16
stakmaskin 10 31 9 16

Comparison: The green community in 2015 vs 2016

One community we did not touch on in the last blog post is the green, environmental community. Here’s a comparison of the main influencers in that category in 2016 vs 2015.

Username Rank in 2016 Rank in 2015 Community ID in 2016 Community ID in 2015
rickardnordin 1 4 13 29
Ekobonden 2 1 13 109
ParHolmgren 3 19 13 29
BjornFerry 4 12 13 133
PWallenberg 5 12 13 109
mattiasgoldmann 6 3 13 29
JKuylenstierna 7 10 13 29
Axdorff 8 3 13 153
fores_sverige 9 11 13 29
GnestaEmma 10 17 13 29

Caveats

Of course, many parts of this analysis could be improved and there are some important caveats. For example, the Infomap algorithm is not deterministic, which means that you are likely to get somewhat different results each time you run it. For these data, we have run it a number of times and seen that you get results that are similar in a general sense each time (in terms of community sizes, top influencers and so on), but it should be understood that some accounts (even top influencers) can in some cases move around between communities just because of this non-deterministic aspect of the algorithm.

Also, it is possible that the way we use to measure community similarity (the Jaccard index, which is the ratio between the number of members in common between two communities and the number of members that are in any or both of the communities – or to put it in another way, the intersection divided by the union) is too coarse, because it does not consider the influence of individual users.

Advertisements

Post Navigation