## root / thesis / ch3_computation_of_bc.tex @ df1b1a42

History | View | Annotate | Download (6.79 KB)

1 | df1b1a42 | Quynh PX Nguyen | %!TEX root = main.tex |
---|---|---|---|

2 | |||

3 | \chapter{Computation of Betweenness Centrality} \label{sec:computation_betweenness} |
||

4 | Recall the definition of betweenness centrality: |
||

5 | |||

6 | \begin{equation*} |
||

7 | \label{eq:betweenness_centrality} |
||

8 | c_B(v) = \sum_{s \neq t \neq v \in V} \delta_{st}(v) = \sum_{s \neq t \neq v \in V} \frac{\sigma_{st}(v)}{\sigma_{st}} |
||

9 | \end{equation*} |
||

10 | |||

11 | Traditionally, just by looking at the equation, we normally divide the task of computing betweenness centrality into 2 steps: |
||

12 | |||

13 | \begin{itemize} |
||

14 | \item First we need to compute tables with length and number of shortest paths between all pairs of vertices in the network. |
||

15 | \item Second for each vertex $v$, we need to consider all the pair $s, t$, we use the tables calculated above to determine the fraction of shortest paths between $s, t$ that are passing through $v$. Summing all the fraction for every possible $s, t$ and we arrive at the $c_B(v)$. This way of summing is referred to by the term \emph{naive summation of pair-dependencies} |
||

16 | \end{itemize} |
||

17 | |||

18 | The chapter starts with presenting some important concept in \cref{sec:computation_important_concepts}. Then \cref{sec:computation_counting_shortest_paths} gives an overview on classic ways to find the shortest paths. And \cref{sec:computation_summing_pair_dependencies} introduces 2 ways to sum all the pair-dependencies: (i) the naive way, (ii) the faster approach to accummulate pair-dependencies gradually by \cite{Brandes01afaster}. Up until now, the approach in \cite{Brandes01afaster} is the state-of-the-art in calculating the exact betweenness centrality, and it is implemented in many standard graph libraries such as |
||

19 | \emph{networkx} \footnote{https://networkx.github.io/}, |
||

20 | \emph{Boost Graph Library} \footnote{www.boost.org/libs/graph/doc/} |
||

21 | The \cref{sec:computation_summary} intends to summarize the result of two previous section with \cref{tab:complexity_of_algorithms} recaping the complexity of some algorithms used in determining betweenness centrality. |
||

22 | |||

23 | After having a throughout background, \cref{sec:computation_heuristic} presents the heuristic way to calculate also the exact betweenness centrality by \cite{PuzisZEDB12}. However, the complexity of the Heuristic Betweenness Centrality is not guaranteed to perform better than the algorithm of \cite{Brandes01afaster}, its performance depends on the topology. |
||

24 | |||

25 | \section{Important concepts} \label{sec:computation_important_concepts} |
||

26 | Define the set of \emph{predecessors} of a vertex $v$ on the shortest path from $s$ as: |
||

27 | |||

28 | \begin{equation} \label{eq:predecessor} |
||

29 | P_s(v) = \{ u \in V : {u, v} \in E, d(s, v) = d(s, u) + \omega(u, v) \} |
||

30 | \end{equation} |
||

31 | |||

32 | where $d(s,u)$ is the minimum length of any path connecting $s$ and $u$. And $\omega(u, v)$ is the weight of the edge ${u, v}$. |
||

33 | |||

34 | Given $P_s(v)$, then we have the following relation: |
||

35 | \begin{equation} |
||

36 | \sigma_{st} = \sum_{u \in P_s(v)}{\sigma_{su}} |
||

37 | \end{equation} |
||

38 | |||

39 | |||

40 | \section{Counting the number of shortest paths} \label{sec:computation_counting_shortest_paths} |
||

41 | |||

42 | Before we begin, there are 2 types of shortest paths problem. The first one is called Single Source Shortest Paths (SSSP), where we compute the shortest paths between one given source vertex $s$ to all other vertices. The second one is All-Pairs Shortest Paths (APSP) where we compute the shortest paths between all vertex pairs. |
||

43 | |||

44 | The SSSP is a classical problem in Graph Theory, and it is usually solved by Breadth-First Search (BFS) for unweighted graphs in $\mathcal{O}(m + n)$, and Dijkstra's algorithm for the weighted graphs in $\mathcal{O}(m + n \log n)$. |
||

45 | |||

46 | The APSP can be computed straightforwardly by applying algorithm for SSSP $n$ times, for all vertex $s \in V$. The resulting time complexity thus will be $\mathcal{O}(mn + n^2)$, and $\mathcal{O}(mn + n^2 \log n)$ respectively. In the sparse graph, it is a good practice to solving the SSSP multiple times. However, in non-sparse graph, where the number of edges is close to the maximal number of edges, i.e. $m \simeq n(n-1) \simeq n^2$, then it is better to use Floyd/Warshall algorithm with the complexity of $\mathcal{O}(n^3)$. |
||

47 | |||

48 | |||

49 | \textbf{XXX What is the input}: the source vertex $s$ |
||

50 | \textbf{XXX What is the output?}: |
||

51 | |||

52 | \subsection{Breadth-First Search for unweighted graph} |
||

53 | I understood the algorithm, but how can I explain it to another person? |
||

54 | |||

55 | |||

56 | |||

57 | \subsection{Dijkstra's algorithm for weighted graph} |
||

58 | The Dijkstra's algorithm can be modified so that when given a source vertex $s$, the output will be the set of predecessors and the number of shortest paths from $s$ to other $t \neq s \in V$ |
||

59 | |||

60 | \begin{algorithm} |
||

61 | \caption{Dijkstra's algorithm} \label{algo:dijkstra} |
||

62 | \begin{algorithmic}[1] |
||

63 | \Procedure{Dijkstra for SSSP with source vertex s}{} |
||

64 | \State $S \leftarrow$ empty stack |
||

65 | \State $P[w] \leftarrow$ empty list, $w \in V$ |
||

66 | \State $\sigma[t] \leftarrow 0, t \in V$; $\sigma[s] \leftarrow 1$ |
||

67 | \State $d[t] \leftarrow -1, t \in V$; $d[s] \leftarrow 0$ |
||

68 | \State $Q \leftarrow $ empty queue |
||

69 | \State enqueue $s \rightarrow Q$ |
||

70 | \While{$Q$ \emph{not empty}} |
||

71 | dequeue $v \leftarrow Q$ |
||

72 | push $v \rightarrow S$ |
||

73 | |||

74 | \EndWhile |
||

75 | \EndProcedure |
||

76 | \end{algorithmic} |
||

77 | \end{algorithm} |
||

78 | |||

79 | |||

80 | \section{Summing all the pair-dependencies} \label{sec:computation_summing_pair_dependencies} |
||

81 | \subsubsection{Naive ways} |
||

82 | \subsubsection{Gradually accummulation of pair-dependencies} \label{sec:computation_accumulation_pair_dependencies} |
||

83 | |||

84 | |||

85 | \section{Complexity of determining betweenness centrality as a whole} \label{sec:computation_summary} |
||

86 | |||

87 | \begin{table} [h] |
||

88 | \caption{Complexity for the 1st step: counting the number and length of shortest paths} |
||

89 | \label{tab:complexity_of_algorithms} |
||

90 | \begin{tabular}{l|l|l} |
||

91 | \toprule |
||

92 | Algorithm name & Unweighted graphs & Weighted graph \\ |
||

93 | \midrule |
||

94 | Breadth-First Search & $\mathcal{O}(mn + n^2)$ & not applicable\\ |
||

95 | Dijkstra & & $\mathcal{O}(mn + n^2 \log n)$ \\ |
||

96 | Floyd/Warshall & & $\mathcal{O}(n^3)$ \\ |
||

97 | \bottomrule |
||

98 | \end{tabular} |
||

99 | \end{table} |
||

100 | |||

101 | \begin{table} [h] |
||

102 | \caption{Complexity for the 2nd step: summing all pair-dependencies} |
||

103 | \label{tab:complexity_of_algorithms} |
||

104 | \begin{tabular}{l|l|l} |
||

105 | \toprule |
||

106 | Algorithm name & Complexity \\ |
||

107 | \midrule |
||

108 | Naive summation & $\mathcal{O}(n^3)$ \\ |
||

109 | \cite{Brandes01afaster} & $\mathcal{O}(mn)$\\ |
||

110 | \bottomrule |
||

111 | \end{tabular} |
||

112 | \end{table} |
||

113 | |||

114 | \section{Heuristic algorithm for the exact betweenness centrality} \label{sec:computation_heuristic} |