Centrality indices

This vignette describes how to build different centrality indices on the basis of indirect relations, as described in the vignette on indirect relations. Note, however, that the primary purpose of the netrankr package is not to provide a great variety of indices but to offer alternative methods for centrality assessment. Nevertheless, the package also provides an RStudio addin, index_builder(), which allows you to create and customize more than 20 different indices.


Theoretical Background

A one-mode network can be described as a dyadic variable x ∈ 𝒲^𝒟, where 𝒲 is the value range of the network (in the simple case of unweighted networks 𝒲 = {0, 1}) and 𝒟 = 𝒩 × 𝒩 describes the dyadic domain of actors 𝒩.

Observed presence or absence of ties (the value range is binary) is usually not the relation of interest for network analytic tasks. Instead, mostly implicitly, relations are transformed into a new set of indirect relations on the basis of the observed relations. As an example, consider (shortest path) distances in the underlying graph. While they are fairly easy to derive from an observed network of contacts, it is impossible for actors in a network to answer the question “How far away are you from others you are not connected with?”. We denote generic transformed networks from an observed network x as τ(x).

With this notion of indirect relations, we can express all centrality indices in a common framework as $$ c_\tau(i)=\sum\limits_{t \in \mathcal{N}} \tau(x)_{it} $$ Degree and closeness centrality, for instance, can be obtained by setting τ = id and τ = dist, respectively. Others need several additional specifications which can be found in Brandes (2016) or Schoch & Brandes (2016).
With this framework, all centrality indices can be characterized as degree-like measures in a suitably transformed network τ(x). To build specific indices, we follow the analytic pipeline for centrality assessment: Observed network (x) → transformation (τ(x)) → aggregation (e.g. ∑_t τ(x)_{it})
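The pipeline can be traced by hand on a tiny network. The following is a minimal base R sketch, not package code: the path graph and the helper shortest_paths() are assumptions made for illustration only.

```r
# Observed network x: adjacency matrix of the undirected path 1 - 2 - 3 - 4.
x <- matrix(0, 4, 4)
x[cbind(1:3, 2:4)] <- 1
x <- x + t(x)

# Hypothetical helper (not part of netrankr): shortest path distances
# computed with the Floyd-Warshall algorithm on the adjacency matrix.
shortest_paths <- function(adj) {
  d <- ifelse(adj > 0, 1, Inf)
  diag(d) <- 0
  for (k in seq_len(nrow(d))) {
    # Allow paths that pass through node k.
    d <- pmin(d, outer(d[, k], d[k, ], "+"))
  }
  d
}

# tau = id: aggregate the observed relation itself, which gives degree.
degree <- rowSums(x)

# tau = dist: aggregate shortest path distances. The reciprocal of this
# total distance is classic closeness.
total_dist <- rowSums(shortest_paths(x))
```

Both results are plain row sums; only the relation being summed differs, which is the sense in which every index is degree-like in a transformed network.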


Building indices with the netrankr package

library(netrankr)
library(igraph)
library(magrittr)

The netrankr package does not, by design, explicitly implement any centrality index. It does, however, provide a large set of components to create indices. Building an index based on an indirect relation, computed with indirect_relations(), is done with the function aggregate_positions().

The usual workflow is as follows:
g %>% indirect_relations() %>% aggregate_positions()
which is equivalent to aggregate_positions(indirect_relations(g)).
The piped form, however, is easier to read and mirrors the proposed analytic pipeline (see above).

aggregate_positions() has a parameter type, which is used to choose an appropriate aggregation method. Commonly, this is simply the sum operation.

data("dbces11")
g <- dbces11

V(g)$name <- 1:11

#Degree
g %>% 
  indirect_relations(type="adjacency") %>% 
  aggregate_positions(type="sum")
#Closeness
g %>% 
  indirect_relations(type="dist_sp") %>% 
  aggregate_positions(type="invsum")
#Betweenness Centrality
g %>% 
  indirect_relations(type="depend_sp") %>% 
  aggregate_positions(type="sum")
#Eigenvector Centrality
g %>% 
  indirect_relations(type="walks",FUN=walks_limit_prop) %>% 
  aggregate_positions(type="sum")

For closeness, type="invsum" is used, since traditional closeness is defined as $$ c_c(i)=\frac{1}{\sum_t dist(i,t)}. $$ To obtain a slight variant of closeness, i.e. $$ c_c(i)=\sum_t \frac{1}{dist(i,t)}, $$ the following code can be used:

#harmonic closeness
g %>% 
  indirect_relations(type="dist_sp",FUN=dist_inv) %>% 
  aggregate_positions(type="sum")
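The difference between the two closeness definitions can be checked by hand. This base R sketch assumes a hard-coded distance matrix for the path graph 1 - 2 - 3 - 4 and does not call the package; it only illustrates "inverse of the sum" versus "sum of the inverses".

```r
# Assumed shortest path distance matrix of the path graph 1 - 2 - 3 - 4.
d <- abs(outer(1:4, 1:4, "-"))

# Classic closeness: reciprocal of the summed distances.
closeness <- 1 / rowSums(d)

# Harmonic variant: sum of reciprocal distances. The zero distance of
# each node to itself yields Inf and is therefore set to 0 before summing.
inv_d <- 1 / d
diag(inv_d) <- 0
harmonic <- rowSums(inv_d)
```

On this graph both variants rank the two inner nodes above the two end nodes. The harmonic variant has the practical advantage that unreachable pairs (infinite distance) contribute 0 instead of driving the whole score to 0.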

Indices based on shortest path distances constitute the biggest group of indices in the netrankr package.

#residual closeness (Dangalchev,2006)
g %>% 
  indirect_relations(type="dist_sp",FUN=dist_2pow) %>% 
  aggregate_positions(type="sum")

#generalized closeness (Agneessens et al.,2017) (alpha>0)
g %>% 
  indirect_relations(type="dist_sp",FUN=dist_dpow,alpha=2) %>% 
  aggregate_positions(type="sum")

#decay centrality (Jackson, 2010) (alpha in [0,1])
g %>% 
  indirect_relations(type="dist_sp",FUN=dist_powd,alpha=0.7) %>% 
  aggregate_positions(type="sum")

#integration centrality (Valente & Foreman, 1998)
dist_integration <- function(x){
  # return the transformed distances visibly instead of via an assignment
  1 - (x - 1)/max(x)
}
g %>% 
  indirect_relations(type="dist_sp",FUN=dist_integration) %>% 
  aggregate_positions(type="sum")
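Each of the distance-based indices above applies an element-wise function to the distance matrix and then sums the rows. The sketch below illustrates this in base R. The three functions are hypothetical stand-ins written from the definitions in the cited papers, not the package's own helpers, and excluding the diagonal is an assumption of this example.

```r
# Assumed distance matrix of the path graph 1 - 2 - 3 - 4.
d <- abs(outer(1:4, 1:4, "-"))

# Hypothetical stand-ins for the transformation helpers (illustration only).
pow2_decay  <- function(x) 2^(-x)               # residual closeness
power_decay <- function(x, alpha) x^(-alpha)    # generalized closeness
geom_decay  <- function(x, alpha) alpha^x       # decay centrality

# Sum each row of the transformed relation, leaving out the diagonal so
# that a node does not count its relation to itself (assumption).
aggregate_sum <- function(tau_x) {
  diag(tau_x) <- 0
  rowSums(tau_x)
}

residual    <- aggregate_sum(pow2_decay(d))
generalized <- aggregate_sum(power_decay(d, alpha = 2))
decay       <- aggregate_sum(geom_decay(d, alpha = 0.7))
```

Any function that maps a distance matrix to a matrix of the same dimensions can play this role, which is what makes a user-defined transformation such as dist_integration() possible.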

The package implements several additional distance measures for networks, for which no index exists so far. Consult the help of indirect_relations() for possibilities.

Another large group of indices is based on walk counts.

#subgraph centrality
g %>% 
  indirect_relations(type="walks",FUN=walks_exp) %>% 
  aggregate_positions(type="self")
#communicability centrality
g %>% 
  indirect_relations(type="walks",FUN=walks_exp) %>% 
  aggregate_positions(type="sum")
#odd subgraph centrality
g %>% 
  indirect_relations(type="walks",FUN=walks_exp_odd) %>% 
  aggregate_positions(type="self")
#even subgraph centrality
g %>% 
  indirect_relations(type="walks",FUN=walks_exp_even) %>% 
  aggregate_positions(type="self")
#katz status
g %>% 
  indirect_relations(type="walks",FUN=walks_attenuated) %>% 
  aggregate_positions(type="sum")
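The "self" and "sum" aggregations differ only in which entries of the transformed matrix they read. The following base R sketch illustrates this for exponentially weighted walk counts, that is, the matrix exponential of the adjacency matrix. It assumes a triangle graph and computes the exponential via an eigendecomposition; it is not the package's implementation.

```r
# Adjacency matrix of the undirected triangle 1 - 2 - 3 - 1.
A <- matrix(1, 3, 3) - diag(3)

# Matrix exponential of a symmetric matrix via its eigendecomposition:
# exp(A) = V diag(exp(lambda)) V^T, which equals sum_k A^k / k!.
eig <- eigen(A, symmetric = TRUE)
exp_A <- eig$vectors %*% diag(exp(eig$values)) %*% t(eig$vectors)

# "self": weighted closed walks that start and end at the same node
# (diagonal entries), as in subgraph centrality.
subgraph_centrality <- diag(exp_A)

# "sum": weighted walks from a node to every node (row sums, here taken
# over all columns including the diagonal), as in communicability.
communicability <- rowSums(exp_A)
```

Because all three nodes of a triangle are structurally equivalent, both vectors are constant here; on less regular graphs the two aggregations can order nodes differently.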

Note: The analytic pipeline can of course be wrapped into a function.

degree_centrality <- function(g){
  DC <- g %>% 
    indirect_relations(type="adjacency") %>% 
    aggregate_positions(type="sum")
  return(DC)
}

Additionally, the RStudio addin index_builder() provides a convenient way to produce the code for any desired index.