Statistics
| Branch: | Revision:

root / thesis / ch3_computation_of_bc.tex @ df1b1a42

History | View | Annotate | Download (6.79 KB)

1
%!TEX root = main.tex
2

    
3
\chapter{Computation of Betweenness Centrality} \label{sec:computation_betweenness}
4
    Recall the definition of betweenness centrality:
5

    
6
    \begin{equation*}
7
        \label{eq:betweenness_centrality}
8
        c_B(v) = \sum_{s \neq t \neq v \in V} \delta_{st}(v) = \sum_{s \neq t \neq v \in V} \frac{\sigma_{st}(v)}{\sigma_{st}}
9
    \end{equation*}
10

    
11
    Traditionally, just by looking at the equation, we normally divide the task of computing betweenness centrality into 2 steps:
12

    
13
    \begin{itemize}
14
        \item First we need to compute tables with length and number of shortest paths between all pairs of vertices in the network.
15
        \item Second for each vertex $v$, we need to consider all the pair $s, t$, we use the tables calculated above to determine the fraction of shortest paths between $s, t$ that are passing through $v$. Summing all the fraction for every possible $s, t$ and we arrive at the $c_B(v)$. This way of summing is referred to by the term \emph{naive summation of pair-dependencies}
16
    \end{itemize}
17

    
18
    The chapter starts with presenting some important concept in \cref{sec:computation_important_concepts}. Then \cref{sec:computation_counting_shortest_paths} gives an overview on classic ways to find the shortest paths. And \cref{sec:computation_summing_pair_dependencies} introduces 2 ways to sum all the pair-dependencies: (i) the naive way, (ii) the faster approach to accummulate pair-dependencies gradually by \cite{Brandes01afaster}. Up until now, the approach in \cite{Brandes01afaster} is the state-of-the-art in calculating the exact betweenness centrality, and it is implemented in many standard graph libraries such as
19
        \emph{networkx} \footnote{https://networkx.github.io/},
20
        \emph{Boost Graph Library} \footnote{www.boost.org/libs/graph/doc/}
21
    The \cref{sec:computation_summary} intends to summarize the result of two previous section with \cref{tab:complexity_of_algorithms} recaping the complexity of some algorithms used in determining betweenness centrality.
22

    
23
    After having a throughout background, \cref{sec:computation_heuristic} presents the heuristic way to calculate also the exact betweenness centrality by \cite{PuzisZEDB12}. However, the complexity of the Heuristic Betweenness Centrality is not guaranteed to perform better than the algorithm of \cite{Brandes01afaster}, its performance depends on the topology.
24

    
25
\section{Important concepts} \label{sec:computation_important_concepts}
26
    Define the set of \emph{predecessors} of a vertex $v$ on the shortest path from $s$ as:
27

    
28
    \begin{equation} \label{eq:predecessor}
29
        P_s(v) = \{ u \in V : {u, v} \in E, d(s, v) = d(s, u) + \omega(u, v) \}
30
    \end{equation}
31

    
32
    where $d(s,u)$ is the minimum length  of any path connecting $s$ and $u$. And $\omega(u, v)$ is the weight of the edge ${u, v}$.
33

    
34
    Given $P_s(v)$, then we have the following relation:
35
    \begin{equation}
36
        \sigma_{st} = \sum_{u \in P_s(v)}{\sigma_{su}}
37
    \end{equation}
38

    
39

    
40
\section{Counting the number of shortest paths} \label{sec:computation_counting_shortest_paths}
41

    
42
    Before we begin, there are 2 types of shortest paths problem. The first one is called Single Source Shortest Paths (SSSP), where we compute the shortest paths between one given source vertex $s$ to all other vertices. The second one is All-Pairs Shortest Paths (APSP) where we compute the shortest paths between all vertex pairs.
43

    
44
    The SSSP is a classical problem in Graph Theory, and it is usually solved by Breadth-First Search (BFS) for unweighted graphs in $\mathcal{O}(m + n)$, and Dijkstra's algorithm for the weighted graphs in $\mathcal{O}(m + n \log n)$.
45

    
46
    The APSP can be computed straightforwardly by applying algorithm for SSSP $n$ times, for all vertex $s \in V$. The resulting time complexity thus will be $\mathcal{O}(mn + n^2)$, and $\mathcal{O}(mn + n^2 \log n)$ respectively. In the sparse graph, it is a good practice to solving the SSSP multiple times. However, in non-sparse graph, where the number of edges is close to the maximal number of edges, i.e. $m \simeq n(n-1) \simeq n^2$, then it is better to use Floyd/Warshall algorithm with the complexity of $\mathcal{O}(n^3)$.
47

    
48

    
49
    \textbf{XXX What is the input}: the source vertex $s$
50
    \textbf{XXX What is the output?}:
51

    
52
    \subsection{Breadth-First Search for unweighted graph}
53
        I understood the algorithm, but how can I explain it to another person?
54

    
55

    
56

    
57
    \subsection{Dijkstra's algorithm for weighted graph}
58
        The Dijkstra's algorithm can be modified so that when given a source vertex $s$, the output will be the set of predecessors and the number of shortest paths from $s$ to other $t \neq s \in V$
59

    
60
        \begin{algorithm}
61
        \caption{Dijkstra's algorithm} \label{algo:dijkstra}
62
        \begin{algorithmic}[1]
63
            \Procedure{Dijkstra for SSSP with source vertex s}{}
64
                \State $S \leftarrow$ empty stack
65
                \State $P[w] \leftarrow$ empty list, $w \in V$
66
                \State $\sigma[t] \leftarrow 0, t \in V$; $\sigma[s] \leftarrow 1$
67
                \State $d[t] \leftarrow -1, t \in V$; $d[s] \leftarrow 0$
68
                \State $Q \leftarrow $ empty queue
69
                \State enqueue $s \rightarrow Q$
70
                \While{$Q$ \emph{not empty}}
71
                    dequeue $v \leftarrow Q$
72
                    push $v \rightarrow S$
73

    
74
                \EndWhile
75
            \EndProcedure
76
        \end{algorithmic}
77
        \end{algorithm}
78

    
79

    
80
\section{Summing all the pair-dependencies} \label{sec:computation_summing_pair_dependencies}
81
    \subsubsection{Naive ways}
82
    \subsubsection{Gradually accummulation of pair-dependencies} \label{sec:computation_accumulation_pair_dependencies}
83

    
84

    
85
\section{Complexity of determining betweenness centrality as a whole} \label{sec:computation_summary}
86

    
87
     \begin{table} [h]
88
        \caption{Complexity for the 1st step: counting the number and length of shortest paths}
89
        \label{tab:complexity_of_algorithms}
90
        \begin{tabular}{l|l|l}
91
            \toprule
92
            Algorithm name & Unweighted graphs & Weighted graph \\
93
            \midrule
94
            Breadth-First Search & $\mathcal{O}(mn + n^2)$ & not applicable\\
95
            Dijkstra &  & $\mathcal{O}(mn + n^2 \log n)$ \\
96
            Floyd/Warshall & & $\mathcal{O}(n^3)$ \\
97
            \bottomrule
98
        \end{tabular}
99
    \end{table}
100

    
101
    \begin{table} [h]
102
        \caption{Complexity for the 2nd step: summing all pair-dependencies}
103
        \label{tab:complexity_of_algorithms}
104
        \begin{tabular}{l|l|l}
105
            \toprule
106
            Algorithm name & Complexity \\
107
            \midrule
108
            Naive summation & $\mathcal{O}(n^3)$ \\
109
            \cite{Brandes01afaster} &  $\mathcal{O}(mn)$\\
110
            \bottomrule
111
        \end{tabular}
112
    \end{table}
113

    
114
\section{Heuristic algorithm for the exact betweenness centrality} \label{sec:computation_heuristic}