21 December 2015 / #R #Temporal Networks

Temporal networks with R and igraph (updated)

A while ago, I wrote a post about how to create animations of temporal networks using R and the amazing package igraph package. The post was written in 2012 and the code does not work with the most recent versions (1.0) of igraph. Here I revisited that post, improving its performance and also making it consistent with the new versions of the package and R.

First of all, let me remind you the basic idea: we want to get an animated evolution of a network in which nodes/edges appear (and/or disappear) dynamically. We also want a “dynamical layout” for the temporal network in which the arrangement of the nodes and edges changes accordingly to the dynamics of the temporal network. In this post I will show you how to render the network at each time step and how to encode all snapshots into a video file using the igraph package in R and ffmpeg. The idea is very simple:

generate a number of snapshots of the network at different times using R and igraph, and
then put them together in a video file using the ffmpeg encoding tool

For the first part we need to draw the temporal network at each snapshot. Given the set of nodes and edges present at a given time, we have to find a layout for that instantaneous graph and plot the graph using that layout. There are many algorithms in igraph to do that, mainly, force based algorithms, which try to find the best disposition of nodes and edges for a given graph, typically starting from a random position. The problem is that from one snapshot to the following, the layout could vary significantly, producing a swarm-of-bees kind of motion when we put the snapshots together

The solution is then to evolve smoothly the layout from one snapshot to the following, by allowing only small changes to accommodate the changes in edges and nodes. To do that we need layout algorithms in which we can specify the initial positions of the nodes and let the algorithm evolve smoothly from that initial position. In igraph, this can be done for the Graphopt (layout_with_graphopt), Kamada-Kawai (layout_with_kk) and Fruchterman-reingold (layout_with_fr) algorithms using the coords or start argument:

require(igraph)
par(mfrow=c(2,2),mar=c(0,0,0,0), oma=c(0,0,0,0))
g = watts.strogatz.game(1,20,3,0.4)
layout.old = layout_with_fr(g)
for(i in 1:4){
  layout.new = layout_with_fr(g,niter=10,coords=layout.old,
                              start.temp=0.05,grid="nogrid")
  plot(g,layout=layout.new)
  layout.old = layout.new
}

As you can see the layouts are similar. There are three parameters passed to the layout function: niter = 10 which specifies the number of iterations (10) of energy minimization procedure in the forced based algorithm. This number should be small, otherwise the final result will be very different from the initial condition. The argument start.temp is the maximum amount of movement allowed along one axis, within one step, for a vertex and it should be kept small for the same reason. Finally, for performance issues, the Fruchterman-reingold algorithm might be implemented in a grid, something we prevent by using the grid=“nogrid” setting.

The second problem is that in a temporal network nodes and/or edges appear and disappear dynamically. Thus the time dependent graph might have different number of nodes and/or edges from one snapshot to the next one. This means that the layout at a given snapshot cannot be used as the initial condition to generate next time layout, since the number of nodes can be different. My approach here is very simple: consider all (past/present/future) nodes/edges and calculate the layout for all of them in each step, but considering only those edges which are present at a given time and displaying only nodes with at least one edge. This trick allows the reutilization of the layouts between steps. Furthermore, it will produce a layout in which present nodes are tightly connected, while past/future nodes are repelled from them. This effect dramatically highlights the appearance and disappearance of nodes, but could create too much confusion if there are many of those events.

To test this ideas, we will again work an important example in the theory of complex networks: the preferential attachment mechanism to generate scale-free networks, i.e. the Barabási-Albert model. In our implementation, we keep the mechanism very simple: starting from an initial core of nodes, at each time step we add a single node that connects to m existing nodes which are selected proportionally to the number of links that the existing nodes already have. This mechanism leads to heavily linked nodes (“hubs”) together with a large fraction of poorly connected nodes. A particular realization of this model can be found in the file edges.csv below. The structure of the file is simple

ff <- read.table("https://raw.githubusercontent.com/emoro/temporal_networks/master/edges.csv",header=T)
head(ff)

##   id1 id2 time
## 1   1   2    1
## 2   1   3    1
## 3   2   3    1
## 4   5   3    2
## 5   6   2    3
## 6   7   2    4

each line of the form id1 | id2 | time indicates that a link between id1 and id2 appears at a particular time. Depending on the context this might represent that the tie was activated at that particular instant (for example if it is a RT between two twitter accounts) or that it was the time in which the edge appeared first (like in our Barabási-Albert model).

Here is the code to generate the snapshots and producing a PNG picture for each of them

#this version of the script has been tested on igraph 1.0.1
#load libraries
require(igraph,RcolorBrewer)

#load the edges with time stamp
#there are three columns in edges: id1,id2,time
edges <- read.table("edges.csv",header=T)

#generate the full graph
g <- graph.data.frame(edges,directed=F)

#generate a cool palette for the graph (darker colors = older nodes)
YlOrBr.pal <- colorRampPalette(brewer.pal(8,"YlOrRd"))
#colors for the nodes are chosen from the very beginning
V(g)$color <- rev(YlOrBr.pal(vcount(g)))[as.numeric(V(g)$name)]

#time in the edges goes from 1 to 300. We kick off at time 3
ti <- 3
#remove edges which are not present
gt <- delete_edges(g,which(E(g)$time > ti))
#generate first layout using graphopt with normalized coordinates. This places the initially connected set of nodes in the middle. If you use fruchterman.reingold it will place that initial set in the outer ring.
layout.old <- norm_coords(layout.graphopt(gt), xmin = -1, xmax = 1, ymin = -1, ymax = 1)

#total time of the dynamics
total_time <- max(E(g)$time)
#This is the time interval for the animation. In this case is taken to be 1/10
#of the time (i.e. 10 snapshots) between adding two consecutive nodes
dt <- 0.1
#Output for each frame will be a png with HD size 1600x900 :)
png(file="animation/example%03d.png", width=1600,height=900)
#Time loop starts
for(time in seq(3,total_time,dt)){
  #remove edges which are not present
  gt <- delete_edges(g,which(E(g)$time > time))
  #with the new graph, we update the layout a little bit
  layout.new <- layout_with_fr(gt,coords=layout.old,niter=10,start.temp=0.05,grid="nogrid")
  #plot the new graph
  plot(gt,layout=layout.new,
       vertex.label="",vertex.size=1+2*log(degree(gt)),
       vertex.frame.color=V(g)$color,edge.width=1.5,
       asp=9/16,margin=-0.15)
  #use the new layout in the next round
  #use the new layout in the next round
  layout.old <- layout.new
}
dev.off()

As you can see the edges present before time ti=3 are considered as the initial seed for the animation. The rest of the edges are removed from the graph and the layout is calculated. At each time step in the loop the same procedure is followed: delete all edges with function delete_edges which are not present at time time, update the layout a little bit and plot the corresponding graph. Note that the size of the vertices is log-proportional to their degree, which means that if there is no edge incident to a node, the size of the node is -Inf and it is not displayed. This way of hiding nodes can be change to be more elegant, but it does the trick here.

After running the script above you will end up with a number of files named example001.png, example002.png and so on. To encode these images into a video format you can use the ffmpeg tool which can be install in linux, windows or mac. The following command line in a terminal shell produces a video file output.mp4 in the mpeg format:

ffmpeg -r 10 -i example%03d.png -b:v 20M output.mp4

The first -r 10 flag controls the rate of frames per second (fps), 10 in this case, while the -b:v 20M sets the bitrate in the output (set to a large value here, 20M). The result is the following video

This is it! Done with 17 lines in R and updated to the last version of igraph (1.0). I am eager to know your comments. Please!

The scripts and data can also be found at https://github.com/emoro/temporal_networks

Temporal networks with R and igraph (updated)

2015 Pictures

Video of my talk “Your data, our society"