## root / thesis / ch3_computation_of_bc.tex @ df1b1a42

History | View | Annotate | Download (6.79 KB)

1 |
%!TEX root = main.tex |
---|---|

2 | |

3 |
\chapter{Computation of Betweenness Centrality} \label{sec:computation_betweenness} |

4 |
Recall the definition of betweenness centrality: |

5 | |

6 |
\begin{equation*} |

7 |
\label{eq:betweenness_centrality} |

8 |
c_B(v) = \sum_{s \neq t \neq v \in V} \delta_{st}(v) = \sum_{s \neq t \neq v \in V} \frac{\sigma_{st}(v)}{\sigma_{st}} |

9 |
\end{equation*} |

10 | |

11 |
Traditionally, just by looking at the equation, we normally divide the task of computing betweenness centrality into 2 steps: |

12 | |

13 |
\begin{itemize} |

14 |
\item First we need to compute tables with length and number of shortest paths between all pairs of vertices in the network. |

15 |
\item Second for each vertex $v$, we need to consider all the pair $s, t$, we use the tables calculated above to determine the fraction of shortest paths between $s, t$ that are passing through $v$. Summing all the fraction for every possible $s, t$ and we arrive at the $c_B(v)$. This way of summing is referred to by the term \emph{naive summation of pair-dependencies} |

16 |
\end{itemize} |

17 | |

18 |
The chapter starts with presenting some important concept in \cref{sec:computation_important_concepts}. Then \cref{sec:computation_counting_shortest_paths} gives an overview on classic ways to find the shortest paths. And \cref{sec:computation_summing_pair_dependencies} introduces 2 ways to sum all the pair-dependencies: (i) the naive way, (ii) the faster approach to accummulate pair-dependencies gradually by \cite{Brandes01afaster}. Up until now, the approach in \cite{Brandes01afaster} is the state-of-the-art in calculating the exact betweenness centrality, and it is implemented in many standard graph libraries such as |

19 |
\emph{networkx} \footnote{https://networkx.github.io/}, |

20 |
\emph{Boost Graph Library} \footnote{www.boost.org/libs/graph/doc/} |

21 |
The \cref{sec:computation_summary} intends to summarize the result of two previous section with \cref{tab:complexity_of_algorithms} recaping the complexity of some algorithms used in determining betweenness centrality. |

22 | |

23 |
After having a throughout background, \cref{sec:computation_heuristic} presents the heuristic way to calculate also the exact betweenness centrality by \cite{PuzisZEDB12}. However, the complexity of the Heuristic Betweenness Centrality is not guaranteed to perform better than the algorithm of \cite{Brandes01afaster}, its performance depends on the topology. |

24 | |

25 |
\section{Important concepts} \label{sec:computation_important_concepts} |

26 |
Define the set of \emph{predecessors} of a vertex $v$ on the shortest path from $s$ as: |

27 | |

28 |
\begin{equation} \label{eq:predecessor} |

29 |
P_s(v) = \{ u \in V : {u, v} \in E, d(s, v) = d(s, u) + \omega(u, v) \} |

30 |
\end{equation} |

31 | |

32 |
where $d(s,u)$ is the minimum length of any path connecting $s$ and $u$. And $\omega(u, v)$ is the weight of the edge ${u, v}$. |

33 | |

34 |
Given $P_s(v)$, then we have the following relation: |

35 |
\begin{equation} |

36 |
\sigma_{st} = \sum_{u \in P_s(v)}{\sigma_{su}} |

37 |
\end{equation} |

38 | |

39 | |

40 |
\section{Counting the number of shortest paths} \label{sec:computation_counting_shortest_paths} |

41 | |

42 |
Before we begin, there are 2 types of shortest paths problem. The first one is called Single Source Shortest Paths (SSSP), where we compute the shortest paths between one given source vertex $s$ to all other vertices. The second one is All-Pairs Shortest Paths (APSP) where we compute the shortest paths between all vertex pairs. |

43 | |

44 |
The SSSP is a classical problem in Graph Theory, and it is usually solved by Breadth-First Search (BFS) for unweighted graphs in $\mathcal{O}(m + n)$, and Dijkstra's algorithm for the weighted graphs in $\mathcal{O}(m + n \log n)$. |

45 | |

46 |
The APSP can be computed straightforwardly by applying algorithm for SSSP $n$ times, for all vertex $s \in V$. The resulting time complexity thus will be $\mathcal{O}(mn + n^2)$, and $\mathcal{O}(mn + n^2 \log n)$ respectively. In the sparse graph, it is a good practice to solving the SSSP multiple times. However, in non-sparse graph, where the number of edges is close to the maximal number of edges, i.e. $m \simeq n(n-1) \simeq n^2$, then it is better to use Floyd/Warshall algorithm with the complexity of $\mathcal{O}(n^3)$. |

47 | |

48 | |

49 |
\textbf{XXX What is the input}: the source vertex $s$ |

50 |
\textbf{XXX What is the output?}: |

51 | |

52 |
\subsection{Breadth-First Search for unweighted graph} |

53 |
I understood the algorithm, but how can I explain it to another person? |

54 | |

55 | |

56 | |

57 |
\subsection{Dijkstra's algorithm for weighted graph} |

58 |
The Dijkstra's algorithm can be modified so that when given a source vertex $s$, the output will be the set of predecessors and the number of shortest paths from $s$ to other $t \neq s \in V$ |

59 | |

60 |
\begin{algorithm} |

61 |
\caption{Dijkstra's algorithm} \label{algo:dijkstra} |

62 |
\begin{algorithmic}[1] |

63 |
\Procedure{Dijkstra for SSSP with source vertex s}{} |

64 |
\State $S \leftarrow$ empty stack |

65 |
\State $P[w] \leftarrow$ empty list, $w \in V$ |

66 |
\State $\sigma[t] \leftarrow 0, t \in V$; $\sigma[s] \leftarrow 1$ |

67 |
\State $d[t] \leftarrow -1, t \in V$; $d[s] \leftarrow 0$ |

68 |
\State $Q \leftarrow $ empty queue |

69 |
\State enqueue $s \rightarrow Q$ |

70 |
\While{$Q$ \emph{not empty}} |

71 |
dequeue $v \leftarrow Q$ |

72 |
push $v \rightarrow S$ |

73 | |

74 |
\EndWhile |

75 |
\EndProcedure |

76 |
\end{algorithmic} |

77 |
\end{algorithm} |

78 | |

79 | |

80 |
\section{Summing all the pair-dependencies} \label{sec:computation_summing_pair_dependencies} |

81 |
\subsubsection{Naive ways} |

82 |
\subsubsection{Gradually accummulation of pair-dependencies} \label{sec:computation_accumulation_pair_dependencies} |

83 | |

84 | |

85 |
\section{Complexity of determining betweenness centrality as a whole} \label{sec:computation_summary} |

86 | |

87 |
\begin{table} [h] |

88 |
\caption{Complexity for the 1st step: counting the number and length of shortest paths} |

89 |
\label{tab:complexity_of_algorithms} |

90 |
\begin{tabular}{l|l|l} |

91 |
\toprule |

92 |
Algorithm name & Unweighted graphs & Weighted graph \\ |

93 |
\midrule |

94 |
Breadth-First Search & $\mathcal{O}(mn + n^2)$ & not applicable\\ |

95 |
Dijkstra & & $\mathcal{O}(mn + n^2 \log n)$ \\ |

96 |
Floyd/Warshall & & $\mathcal{O}(n^3)$ \\ |

97 |
\bottomrule |

98 |
\end{tabular} |

99 |
\end{table} |

100 | |

101 |
\begin{table} [h] |

102 |
\caption{Complexity for the 2nd step: summing all pair-dependencies} |

103 |
\label{tab:complexity_of_algorithms} |

104 |
\begin{tabular}{l|l|l} |

105 |
\toprule |

106 |
Algorithm name & Complexity \\ |

107 |
\midrule |

108 |
Naive summation & $\mathcal{O}(n^3)$ \\ |

109 |
\cite{Brandes01afaster} & $\mathcal{O}(mn)$\\ |

110 |
\bottomrule |

111 |
\end{tabular} |

112 |
\end{table} |

113 | |

114 |
\section{Heuristic algorithm for the exact betweenness centrality} \label{sec:computation_heuristic} |