Revision c6e2ce5a thesis/ch1_introduction.tex

View differences:

thesis/ch1_introduction.tex
1 1
%!TEX root = main.tex
2 2

  
3 3
\chapter{Introduction}
4
    People have a strong need for acquiring information. The faster, the more reliable, the more secured way, the more affordable mean to communicate, the better that communication technique is. During the 90s we have seen the boom of the Internet, at that time most devices was connected to the Internet through a cable with an ISP standing in between to provide services. In the late 2000s, connecting to the Internet without plugging anything in had became a norm. Most of the wireless networks that we are familiar with now such as 3G, LTE, 4G, Wi-Fi, they are centralized network, which means those networks rely on the base stations to broadcast signal, or access points in a managed wireless network to provide the service. And then, \textbf{XXX any year?}, Wireless Mesh Networks (WMNs) have merged as a key technology for next-generation wireless networking.
4
    People have a strong need for acquiring information, and they want faster, more reliable, more secured, and more affordable mean to communicate. During the 90s we have seen the boom of the Internet, at that time most devices was connected to the Internet through a cable with an ISP standing in between to provide services. In the late 2000s, connecting to the Internet without plugging anything in became a norm. Most of the wireless networks that we are familiar with now, such as 3G, LTE, 4G, Wi-Fi are centralized networks, which means those networks rely on base stations to broadcast signal, or access points in a managed wireless network to give the service. Wireless Mesh Networks (WMNs) in recent years are emerging as a key technology for next-generation wireless networking. They do not have any centralized control to route and forward packets, but rather all nodes cooperate together to relay each others' packets. The efficient protocol is important than ever in WMNs since it would affect the performance largely. And several papers \cite{Vazquez-Rodas:2013:TCW:2507248.2507257}, \cite{Maccari2014670} have proposed the use of centrality to enhance routing in WMNs.
5 5

  
6
    We begin, in this introductory chapter with the short overview of WMNs in \cref{sec:introduction_wmns}. Then we will present one application of WMNs in reality - the Wireless Community Networks (WCNs). We will cover the definition plus some challenging problems of WCNs in \cref{sec:introduction_wcns}. The chapter ends with the contribution, and the outline of the thesis.
6
    \emph{Centrality}\footnote{it is also referred to by other terms: centrality indices or centrality measures} is a measurement to quantify how influential a node is, given its position in the network. We concerns with centrality measurements since in WMNs, the ability to find central node can serve many great purposes such as adding more routers around those most central one to balance the load, and to lessen the impact when one of the most central routers fail. Or information about the central node can help with routing, to allow information to be spreaded faster in the network.
7 7

  
8
\section{Wireless Mesh Networks} \label{sec:introduction_wmns}
9
        Wireless Mesh Network (WMN) is any wireless network having a network topology of either a partial or full mesh topology \cite{zhang2006wireless}. In full mesh topology, every pair of nodes is directly connected with each other, while in partial mesh topology, not all nodes are directly interconnected with each other by other nodes. Eventhough the definition of WMN mentions the full mesh topology, in practice, WMN is usually built over a partial mesh topology. This is due to the limited coverage of signal from wireless devices making it impossible for each node to directly conect with all other nodes.
8
    In this thesis, the main focus is on one specific centrality measurement called \emph{betweenness centrality} $c_B(v)$. Intuitively, a node with high betweenness centrality has a large control over the traffic flowing within the network, given that the information is transfered using the shotest paths. And betweenness centrality defined as the sum of the fraction of total number of shortest paths that pass through that node. The best known algorithm \cite{Brandes01afaster} to calculate $c_B(v)$ runs in quadratic time, making the calculation for a large graph impractical. Another algorithm in focus is heuristic-based \cite{PuzisZEDB12} which demonstrated its effectiveness in several input graphs of choice. The main work of thesis is implementing those forementioned algorithms, and carrying out the performance evaluation for both of them on a computer, and on a low-cost wireless router. We want to know whether using the heuristic-based algorithm to calculate the betweenness centrality for every nodes in topology of WCNs would scale better than using the original algorithm.
9

  
10
    We begin with a short overview of WMNs in \cref{sec:introduction_wmns}. Then we present one application of WMNs in reality -- the Wireless Community Networks (WCNs). We cover the definition plus some challenging problems of WCNs in \cref{sec:introduction_wcns}. After having all the background knowledge for WMNs and WCNs, the contribution of this thesis for WCNs is laid out in \cref{sec:introduction_contribution}. The outline for the subsequent chapters can be found in \cref{sec:introduction_outline}.
10 11

  
11
        For classification, WMNs belong to a category of multi-hop wireless network, where information traverses from a source to a destination using two or more wireless hops.
12
\section{Wireless Mesh Networks} \label{sec:introduction_wmns}
13
        A WMN is any wireless network having a mesh topology\footnote{Mesh topology is used to denote any network that every nodes in it participating in the distribution of data}. In full mesh topology, every pair of nodes is directly connected with each other, while in partial mesh topology, not all nodes are directly interconnected with each other. Even though the definition of WMN mentions the full mesh topology, in practice, WMN is usually built over a partial mesh topology. This is due to the limited coverage of signal from wireless devices making it impossible for each node to directly connect with all other nodes.
12 14

  
13
        The main advantages of a WMN is its capability of fault intolerance. Since WMN is a decentralized wireless network, when several nodes is taking down, the network can find a new path way to link the network together. The term \emph{autonomic system} is normally used to describe properties of WMN.
15
        For classification, WMNs belong to a class of multi-hop wireless network, where information traverses from a source to a destination using two or more wireless hops.
14 16

  
15
        WMN has several properties:
16
        \begin{itemize}
17
            \item Self-forming
18
            \item Self healing
19
            \item Self optimization
20
            \item Multi hop
21
        \end{itemize}
17
        The main advantages of a WMN is its ability of fault intolerance. Since WMN is a decentralized wireless network, when several nodes fail, the network can find a new path way to link the network together. The term \emph{autonomic system} is often used to describe properties of WMN.
22 18

  
23
        WMNs find a great applications in many areas \citep{Akyildiz2005445} such as: broadband home networking, enterprise networking, community networking, city-wide wireless coverage, spontaneous mesh network, transporation systems. However, all different areas for application can be classified into 2 categories. Firstly, WMN acts as a backbone network to extend the Internet connectivity for nodes that are not in the coverage of Internet access. This might due to the fact that installing wired infrastructure is too expensive for rural areas, or areas with challenging terrains. Another scenario is in a diaster-stroke area where the current communication facilities (cables, base stations) were destroyed, then WMNs can provide temporary solution for communication. Secondly, WMNs is used to enable the communication between nodes within the mesh network.
19
        WMNs find interesting applications in many areas \citep{Akyildiz2005445} such as: broadband home networking, enterprise networking, community networking, city-wide wireless coverage, spontaneous mesh network, transportation systems. All the different areas of application can be classified into two categories. First, the WMN acts as a backbone network to extend the Internet connectivity for nodes that are not in the coverage of Internet access. This might due to the fact that installing wired infrastructure is too expensive for rural areas, or areas with challenging terrains. Another scenario is in a disaster-stroke area where the current communication facilities (cables, base stations) were destroyed, then WMNs can provide temporary solution for communication. There are many examples for this usage in reality. It has been used in Dharamsala \citep{jardin2006} to provide internet access for refugee monks living in the mountainous area in Dharamsala. \cite{Suzuki2006} introduced \emph{SKYMESH} to provide Internet access in a disaster area. It use helium-filled balloons carrying WiFi transceivers, and together they form the mesh network.
24 20

  
25
        For the first category - extending the Internet connectivity, there are many examples on its usage in reality. It has been used in Dharamsala \citep{jardin2006} to provide internet access for refugee monks living in the mountainous area in Dharamsala. \cite{Suzuki2006} introduced \emph{SKYMESH} to provide Internet access in a disaster area. It use helium-filled balloons carrying WiFi transceivers, and together they form the mesh network.
21
        The second application of WMNs focus on the communication within the network. For example, in transportation management, \cite{Ahmed2008} proposed a prototype using wireless mesh networks to help with monitoring traffic. The car has sensors to collect data, and the real-time data is gathered using the WMN. Wireless Community Networks are another applications making use of mesh topology. The WCN contains a group of local users setting up and managing a network for themselves. Actually, WCNs can have a gateway to the Internet, and it can act as a backbone to deliver Internet connectivity. However, in reality, most WCNs were set up to promote an alternative communication channel that offers higher degree of privacy and neutrality \cite{Maccari2015175}.WCNs is discussed more in \cref{sec:introduction_wcns}.
26 22

  
27
        The second usage of WMNs focus on the communication within the network. For example, in transportation management, \cite{Ahmed2008} proposed a prototype using wireless mesh networks to help with monitoring traffic. The car has sensors to collect data, and the real-time data is gathered using the WMN. Wireless Community Networks are another applications of WMNs, where a group of local users set up and manage a network for themselves. WCNs can have a gateway (or not) to the Internet. This WCNs will be discussed more in \cref{sec:introduction_wcns}.
28 23

  
29 24
\section{Wireless Community Networks} \label{sec:introduction_wcns}
30 25
Application:
31
    Wireless Community Network (WCN) \cite{Jain2003} belongs to wireless mesh network. It also servers 2 purposes as WMN: to provide internet access for those nodes which do not have it, and to create an environment for communication that offer higher privacy.
26
    Wireless Community Network (WCN) \cite{Jain2003} is one application of Wireless Mesh Network. WCN is created by a group of local users to have another channel of communication that is self-managed and offer higher degree of privacy. WCNs have their presents on all continents, but most of them are in Europe and North America. The largest WCN all over the world now is Guifi\footnote{https://guifi.net} in Spain with over 30000 nodes in 2015.
32 27

  
33
    What
28
    For WCNs, new router can join the existing WCNs independently by following the technical information listed on the Web. Every nodes in WCNs must have a way to know the existence of the new router, or the unavailability of some other router when it leaves the network. There are several popular routing protocols used in WCNs, taking into account the ever-changing topology of WCNs to deal with that problem. They are Optimized Link State Routing (OLSR), Babel, and B.A.T.M.A.N. But probably OLSR is the most widespread routing protocol in used. More thorough discussion for the mentioned protocols can be found in \cite{Pisa2013}.
34 29

  
35
http://www.npr.org/2006/08/10/5631353/a-wireless-network-for-little-lhasa
30
    In this thesis, we focus on three WCNs: Funk Feuer Wien (FFW)\footnote{https://www.funkfeuer.at/} and Funk Feuer Graz (FFG)\footnote{graz.funkfeuer.at/en/index.html} in Austria and Ninux (NNX)\footnote{wiki.ninux.org/} in Italy. And they all use Optimized Link State Routing Protocol (OLSR), where every nodes in the WCNs have the topology information, and compute the next hop destinations for all nodes in the network using shortest hop forwarding paths.
36 31

  
37
    \cite{Maccari2015175} presented an argument that closeness centrality can be used
38 32

  
39 33
\section{Contributions of this thesis} \label{sec:introduction_contribution}
40
    In graph theory, centrality metrics (which will be explored in depth in \cref{sec:centrality}) is used to determine how influential a node is in the network. \cite{Katsaros2010} presented two types of centrality metrics that he considered them to have significant applications in analysing wireless networks. They are degree based centrality, and shortest-path based centrality.
41

  
42
    \cite{Maccari2015175} proposed to use \emph{closeness group centrality} $C_C(\gamma)$ - one type of degree based centrality - to identify the central group in the WCNs. $\gamma$ is a group of nodes. Knowing the central group, the group with the lowest $C_C(\gamma)$ can be useful in, for example, determining where to place the Internet gateway. Suppose that this WCNs is designed to provide Internet access for its client, then the most central group will give the best average Internet connectivity to the nodes in the network.
43

  
44
    The type of centrality metrics to choose will depend on the purpose of analysis. \cite{Maccari2014670} also mentioned shortest-path based centrality $C_sp(k)$ should be employed when we want choose a set of nodes that have control a fraction $\eta$ of the entire traffic of the network. \citeauthor{Maccari2014670} gave 3 scenarios on why knowing betweenness centrality would help in resolving those 3 problems. Given that WCNs is not a static network, nodes can come and leave the network as they wish. Therefore, the betweenness centrality would change over time. Moreover, routers do not have enough computational power to excecute complex algorithm. Therefore, to make it feasible for WCNs to detect the set of most influential nodes, the algorithm must run fast enough \textbf{XXX how fast?} on the computationally limited physical device.
45

  
46
    In this thesis, I test out the performance of shortest-path based betweenness centrality (from now on it will only be called \emph{betweenness centrality} algorithm on the real router. The state-of-the-art for calculating betweenness centrality is $\mathcal{O}(mn + n^2 \log n)$ \citep{Brandes01afaster}. Eventhough it is a huge improvement comparing to $\mathcal{O}(n^3)$, it is still not feasible to compute that centrality for large graph. The second algorithm is the heuristic-based by \cite{PuzisZEDB12}. This algorithm follow a divide-and-conquer design paradigm, the network is divided into bi-connected components. Betweenness centrality is calculated for each bi-connected components with a little bit of modification, and then they are summed up to achieve the same result as the state-of-the-art algorithm. The presentation of both algorithms can be found in \cref{sec:computation_betweenness}
47

  
48
    Since the algorithm of \citeauthor{PuzisZEDB12} is a heuristic-based, there is no guarantee on the running time. Therefore, I carry out an extensive experiments to compare 2 algorithms. The experiment is run on random graphs: Erdos, Power Law, Community Network \footnote{This is not the real wireless community networks, but the generation algorithm was designed such that the topology is similar to the topology found in WCNs}. And the performance is measured in the virtualized environment with QEMU \footnote{http://wiki.qemu.org}, and on the real router.
49

  
50
    This is the first time that both algorithms of betweenness centrality has been tested on the real device.
51

  
52
\section{Outline of this thesis} \label{sec:introduction_outline}
53
    The outline of this paper is as follows. The first two chapters cover all the necessasry background information. Section \cref{sec:centrality} examines the concept of centrality. Since there are numereous measures for centrality, only those centrality measures that are most relevant  will be presented. And betweenness centrality will be discussed in deep. Section \cref{sec:computation_betweenness} presents 2 algorithms to calculate the exact betweenness.
34
    In graph theory, centrality metrics (which is explained in depth in \cref{sec:centrality}) is used to determine how influential a node is in the network. The authors in \cite{Katsaros2010} presented two types of centrality metrics that he considered them to have significant applications in analysing wireless networks. They are degree based centrality, and shortest-path based centrality.
54 35

  
55
    The next two chapters deals with the setting up and results of the simulation. Details on the dataset, repetition will be provided in \cref{sec:sim_scenario}. \cref{sec:sim_experiment} summarises the result of the simulation.
36
    The authors in \cite{Maccari2015175} proposed to use \emph{closeness group centrality} $C_C(\gamma)$ -- one type of degree based centrality -- to identify the central group in the WCNs. $\gamma$ is a group of nodes. Knowing the central group, the group with the lowest $C_C(\gamma)$ can be useful in, such as, determining where to place the Internet gateway. Suppose that this WCNs is designed to provide Internet access for its client, then the most central group will give the best average Internet connectivity to the nodes in the network.
56 37

  
57
\clearpage
38
    The type of centrality metrics to choose depends on the purpose of analysis. \cite{Maccari2014670} also mentioned shortest-path based centrality $C_{sp}(k)$ should be employed when we want choose a set of nodes that have control a fraction $\eta$ of the entire traffic of the network. There are 3 scenarios in which betweenness centrality would be employed. Given that WCNs is not a static network, nodes can join and leave the network as they wish. Moreover, there is no single machine managing the whole network, but rather every router has the whole information of the topology and cooperates together to keep the information flowing freely in the network. In sum, we know that betweenness centrality is a useful measure for WCN, and that computing betweenness centrality must not be too demanding for the routers since routers do not have enough computational power to execute as quick as the normal computer. Therefore, to make it feasible for WCNs to detect the set of most influential nodes, the algorithm must run fast enough on the computationally limited physical device.
58 39

  
59
\section{Scope}
60
        The scope of the paper:
61
        \begin{itemize}
62
            \item no self-loop
63
            \item no multigraph: graph with no parallel edges
64
            \item unweighted and weighted graph
65
            \item can apply to connected or disconnected graph. In the disconnected graph, certain vertices are not reachable from vertices in a different component. This yeilds the zero shortest paths counts, and as a result those 2 endpoints contributes 0 toward the betweenness.
66
        \end{itemize}
40
    In this thesis, I test the performance of shortest-path based betweenness centrality (from now on it is called \emph{betweenness centrality} algorithm on a real router typically used in WCNs. The state-of-the-art for calculating betweenness centrality is $\mathcal{O}(mn + n^2 \log n)$ \citep{Brandes01afaster}. Even though it is a huge improvement comparing to $\mathcal{O}(n^3)$, it is still not feasible to compute centrality in large graphs. The second algorithm is the heuristic-based presented in \cite{PuzisZEDB12}. This algorithm follow a divide-and-conquer design paradigm, the network is divided into bi-connected components. Betweenness centrality is calculated for each bi-connected components with a bit of modification, and then they are summed up to achieve the same result as the latest algorithm. The presentation of both algorithms can be found in \cref{sec:computation_betweenness}.
67 41

  
68
    connected graph: since the betweenness centrality is calculated based on shortest paths. In the disconnected graph, certain vertices are not reachable from vertices in a different component. This yields the zero shortest paths counts for betweenness centrality.
42
    Since the algorithm of \cite{PuzisZEDB12} is heuristic-based, there is no guarantee on the running time.  I carry out an extensive experiments to compare the two algorithms. The experiment is run on synthetic graphs: Erdos, Power Law, Community Network\footnote{This is not the real wireless community networks, but the generation algorithm was designed such that the topology is similar to the topology found in WCNs}. And it is also run on three real wireless community networks: Funk Feuer Wien (FFW)\footnote{https://www.funkfeuer.at/} and Funk Feuer Graz (FFG)\footnote{graz.funkfeuer.at/en/index.html} in Austria, and Ninux\footnote{wiki.ninux.org/} in Italy. And the performance is measured in the virtualized environment with QEMU\footnote{http://wiki.qemu.org}, and on the real router. More details can be found in \cref{sec:experiment_input}.
69 43

  
44
    This is the first time that both algorithms for calculating betweenness centrality has been tested on the real device. They both have the same polynomial time-bound, but we show later that heuristic modification can greatly improve the running time for the Community Network graphs.
70 45

  
71
\section{Notation}
72
    A graph $G=(V,E)$ contains a set of vertices (nodes) $V$ and a set of edges (links) $E$. We assume we assume that al graphs are undirected and connected. \textbf{XXX check the modification for directed graph}
73 46

  
74
    $m = |E|$ is the number of edges in graph $G$
47
\section{Outline of the thesis} \label{sec:introduction_outline}
48
    The outline of this paper is as follows. The first two chapters cover all the necessary background information. Section \cref{sec:centrality} examines the concept of centrality. Since there are numerous measures for centrality, only those centrality measures that are most relevant is presented. And betweenness centrality is going to be discussed in deep since it is the measure that we we choose to evaluate the performance for this thesis. Section \cref{sec:computation_betweenness} presents two algorithms to calculate the exact betweenness. One is the state-of-the-art algorithm, with the mathematically proven upper bound at $\mathcal{O}(mn + n^2 \log n)$. Another algorithm is based on heuristic for speeding up the betweenness centrality computation. And since there is no upper bound for the heuristic algorithm, we want to compare both algorithms to see the performance of both.
75 49

  
76
    $n = |V|$ is the number of vertices in graph $G$
50
    The next two chapters deals with the experimental to evaluate performance of these two forementioned algorithms. \cref{sec:experiment} gives detail on the experiment setup. This includes the description of the synthetic graphs, and real WCNs topology that we use in the experiment. \cref{sec:experiment_result} summarises the results of the experiment and gives the justification for the results.
77 51

  
78
    Let $\omega$ be a weight function on the edges. We assume that $\omega(e) > 0, e \in E$ for weighted graphs. And we define $\omega(e) = 1, e \in E$ for unweighted graphs. If for $s, t \in V$ and ${s, t} \in E$, then we can also use $\omega(s, t)$ to denote the weight of the edge between $s$ and $t$.
79 52

  
80
    The \emph{length} of a path is the sum of weights of all edges along the path. We use $d(s, t)$ to denote the \emph{distance} between vertices $s$ and $t$, i.e. the minimum length of any path connecting $s$ and $t$.

Also available in: Unified diff