Statistics
| Branch: | Revision:

iof-tools / networkxMiCe / networkx-master / doc / reference / introduction.rst @ 5cef0f13

History | View | Annotate | Download (13.6 KB)

1
Introduction
2
============
3

    
4
.. currentmodule:: networkx
5

    
6
The structure of NetworkX can be seen by the organization of its source code.
7
The package provides classes for graph objects, generators to create standard
8
graphs, IO routines for reading in existing datasets, algorithms to analyze
9
the resulting networks and some basic drawing tools.
10

    
11
Most of the NetworkX API is provided by functions which take a graph object
12
as an argument.  Methods of the graph object are limited to basic manipulation
13
and reporting.  This provides modularity of code and documentation.
14
It also makes it easier for newcomers to learn about the package in stages.
15
The source code for each module is meant to be easy to read and reading
16
this Python code is actually a good way to learn more about network algorithms,
17
but we have put a lot of effort into making the documentation sufficient and friendly.
18
If you have suggestions or questions please contact us by joining the
19
`NetworkX Google group <http://groups.google.com/group/networkx-discuss>`_.
20

    
21
Classes are named using ``CamelCase`` (capital letters at the start of each word).
22
functions, methods and variable names are ``lower_case_underscore`` (lowercase with
23
an underscore representing a space between words).
24

    
25

    
26
NetworkX Basics
27
---------------
28

    
29
After starting Python, import the networkx module with (the recommended way)
30

    
31
.. nbplot::
32

    
33
   >>> import networkx as nx
34

    
35
To save repetition, in the documentation we assume that
36
NetworkX has been imported this way.
37

    
38
If importing networkx fails, it means that Python cannot find the installed
39
module. Check your installation and your ``PYTHONPATH``.
40

    
41
The following basic graph types are provided as Python classes:
42

    
43
:class:`Graph`
44
   This class implements an undirected graph. It ignores
45
   multiple edges between two nodes.  It does allow self-loop
46
   edges between a node and itself.
47

    
48
:class:`DiGraph`
49
   Directed graphs, that is, graphs with directed edges.
50
   Provides operations common to directed graphs,
51
   (a subclass of Graph).
52

    
53
:class:`MultiGraph`
54
   A flexible graph class that allows multiple undirected edges between
55
   pairs of nodes.  The additional flexibility leads to some degradation
56
   in performance, though usually not significant.
57

    
58
:class:`MultiDiGraph`
59
   A directed version of a MultiGraph.
60

    
61
Empty graph-like objects are created with
62

    
63
.. nbplot::
64

    
65
   >>> G = nx.Graph()
66
   >>> G = nx.DiGraph()
67
   >>> G = nx.MultiGraph()
68
   >>> G = nx.MultiDiGraph()
69

    
70
All graph classes allow any :term:`hashable` object as a node.
71
Hashable objects include strings, tuples, integers, and more.
72
Arbitrary edge attributes such as weights and labels
73
can be associated with an edge.
74

    
75

    
76
The graph internal data structures are based on an
77
adjacency list representation and implemented using
78
Python :term:`dictionary` datastructures.
79
The graph adjacency structure is
80
implemented as a Python dictionary of
81
dictionaries; the outer dictionary is keyed by nodes to values that are
82
themselves dictionaries keyed by neighboring node to the
83
edge attributes associated with that edge.  This "dict-of-dicts" structure
84
allows fast addition, deletion, and lookup of nodes and neighbors in
85
large graphs.  The underlying datastructure is accessed directly
86
by methods (the programming interface "API") in the class definitions.
87
All functions, on the other hand, manipulate graph-like objects
88
solely via those API methods and not by acting directly on the datastructure.
89
This design allows for possible replacement of the 'dicts-of-dicts'-based
90
datastructure with an alternative datastructure that implements the
91
same methods.
92

    
93

    
94
Graphs
95
------
96

    
97
The first choice to be made when using NetworkX is what type of graph
98
object to use.  A graph (network) is a collection of nodes together
99
with a collection of edges that are pairs of nodes.  Attributes are
100
often associated with nodes and/or edges.  NetworkX graph objects come in
101
different flavors depending on two main properties of the network:
102

    
103
 - Directed: Are the edges **directed**?  Does the order of the edge
104
   pairs $(u, v)$ matter?  A directed graph is specified by the "Di"
105
   prefix in the class name, e.g. `DiGraph()`.  We make this distinction
106
   because many classical graph properties are defined differently for
107
   directed graphs.
108

    
109
 - Multi-edges: Are multiple edges allowed between each pair of nodes?
110
   As you might imagine, multiple edges requires a different data
111
   structure, though clever users could design edge data attributes to
112
   support this functionality.  We provide a standard data structure
113
   and interface for this type of graph using the prefix "Multi",
114
   e.g., `MultiGraph()`.
115

    
116
The basic graph classes are named:
117
:doc:`Graph </reference/classes/graph>`,
118
:doc:`DiGraph</reference/classes/digraph>`,
119
:doc:`MultiGraph </reference/classes/multigraph>`, and
120
:doc:`MultiDiGraph </reference/classes/multidigraph>`
121

    
122

    
123
Nodes and Edges
124
^^^^^^^^^^^^^^^
125

    
126
The next choice you have to make when specifying a graph is what kinds
127
of nodes and edges to use.
128

    
129
If the topology of the network is all you
130
care about then using integers or strings as the nodes makes sense and
131
you need not worry about edge data.  If you have a data structure
132
already in place to describe nodes you can simply use that structure
133
as your nodes provided it is :term:`hashable`.  If it is not hashable you can
134
use a unique identifier to represent the node and assign the data
135
as a :term:`node attribute`.
136

    
137
Edges often have data associated with them.  Arbitrary data
138
can be associated with edges as an :term:`edge attribute`.
139
If the data is numeric and the intent is to represent
140
a *weighted* graph then use the 'weight' keyword for the attribute.
141
Some of the graph algorithms, such as
142
Dijkstra's shortest path algorithm, use this attribute
143
name by default to get the weight for each edge.
144

    
145
Attributes can be assigned to an edge by using keyword/value
146
pairs when adding edges.  You can use any keyword
147
to name your attribute and can then query the edge
148
data using that attribute keyword.
149

    
150
Once you've decided how to encode the nodes and edges, and whether you have
151
an undirected/directed graph with or without multiedges you are ready to build
152
your network.
153

    
154
Graph Creation
155
--------------
156

    
157
NetworkX graph objects can be created in one of three ways:
158

    
159
- Graph generators---standard algorithms to create network topologies.
160
- Importing data from pre-existing (usually file) sources.
161
- Adding edges and nodes explicitly.
162

    
163
Explicit addition and removal of nodes/edges is the easiest to describe.
164
Each graph object supplies methods to manipulate the graph.  For example,
165

    
166
.. nbplot::
167

    
168
   >>> import networkx as nx
169
   >>> G = nx.Graph()
170
   >>> G.add_edge(1, 2)  # default edge data=1
171
   >>> G.add_edge(2, 3, weight=0.9)  # specify edge data
172

    
173
Edge attributes can be anything:
174

    
175
.. nbplot::
176

    
177
   >>> import math
178
   >>> G.add_edge('y', 'x', function=math.cos)
179
   >>> G.add_node(math.cos)  # any hashable can be a node
180

    
181
You can add many edges at one time:
182

    
183
.. nbplot::
184

    
185
   >>> elist = [(1, 2), (2, 3), (1, 4), (4, 2)]
186
   >>> G.add_edges_from(elist)
187
   >>> elist = [('a', 'b', 5.0), ('b', 'c', 3.0), ('a', 'c', 1.0), ('c', 'd', 7.3)]
188
   >>> G.add_weighted_edges_from(elist)
189

    
190
See the :doc:`/tutorial` for more examples.
191

    
192
Some basic graph operations such as union and intersection
193
are described in the :ref:`operators module <operators>` documentation.
194

    
195
Graph generators such as :func:`~generators.random_graphs.binomial_graph`
196
and :func:`~generators.random_graphs.erdos_renyi_graph` are
197
provided in the :ref:`graph generators <generators>` subpackage.
198

    
199
For importing network data from formats such as GML, GraphML, edge list text files
200
see the :ref:`reading and writing graphs <readwrite>` subpackage.
201

    
202

    
203
Graph Reporting
204
---------------
205

    
206
Class views provide basic reporting of nodes, neighbors, edges and degree.
207
These views provide iteration over the properties as well as membership
208
queries and data attribute lookup. The views refer to the graph data structure
209
so changes to the graph are reflected in the views. This is analogous to
210
dictionary views in Python 3. If you want to change the graph while iterating
211
you will need to use e.g. ``for e in list(G.edges):``. The views provide
212
set-like operations, e.g. union and intersection, as well as dict-like
213
lookup and iteration of the data attributes using ``G.edges[u, v]['color']``
214
and ``for e, datadict in G.edges.items():``. Methods ``G.edges.items()`` and
215
``G.edges.values()`` are familiar from python dicts. In addition ``G.edges.data()``
216
provides specific attribute iteration e.g. ``for e, e_color in G.edges.data('color'):``.
217

    
218
The basic graph relationship of an edge can be obtained in two ways.
219
One can look for neighbors of a node or one can look for edges.
220
We jokingly refer to people who focus on nodes/neighbors as node-centric
221
and people who focus on edges as edge-centric.  The designers of NetworkX
222
tend to be node-centric and view edges as a relationship between nodes.
223
You can see this by our choice of lookup notation like ``G[u]`` providing neighbors
224
(adjacency) while edge lookup is ``G.edges[u, v]``.
225
Most data structures for sparse graphs are essentially adjacency lists and so
226
fit this perspective. In the end, of course, it doesn't really matter which way
227
you examine the graph. ``G.edges`` removes duplicate representations of undirected
228
edges while neighbor reporting across all nodes will naturally report both directions.
229

    
230
Any properties that are more complicated than edges, neighbors and degree are
231
provided by functions.  For example ``nx.triangles(G, n)`` gives the number of triangles
232
which include node n as a vertex.  These functions are grouped in the code and
233
documentation under the term :ref:`algorithms<algorithms>`.
234

    
235

    
236
Algorithms
237
----------
238

    
239
A number of graph algorithms are provided with NetworkX.
240
These include shortest path, and breadth first search
241
(see :ref:`traversal<traversal>`),
242
clustering and isomorphism algorithms and others.  There are
243
many that we have not developed yet too.  If you implement a
244
graph algorithm that might be useful for others please let
245
us know through the
246
`NetworkX Google group <http://groups.google.com/group/networkx-discuss>`_
247
or the Github `Developer Zone <https://github.com/networkx/networkx>`_.
248

    
249
As an example here is code to use Dijkstra's algorithm to
250
find the shortest weighted path:
251

    
252
.. nbplot::
253

    
254
   >>> G = nx.Graph()
255
   >>> e = [('a', 'b', 0.3), ('b', 'c', 0.9), ('a', 'c', 0.5), ('c', 'd', 1.2)]
256
   >>> G.add_weighted_edges_from(e)
257
   >>> print(nx.dijkstra_path(G, 'a', 'd'))
258
   ['a', 'c', 'd']
259

    
260
Drawing
261
-------
262

    
263
While NetworkX is not designed as a network drawing tool, we provide
264
a simple interface to drawing packages and some simple layout algorithms.
265
We interface to the excellent Graphviz layout tools like dot and neato
266
with the (suggested) pygraphviz package or the pydot interface.
267
Drawing can be done using external programs or the Matplotlib Python
268
package.  Interactive GUI interfaces are possible, though not provided.
269
The drawing tools are provided in the module :ref:`drawing <drawing>`.
270

    
271
The basic drawing functions essentially place the nodes on a scatterplot
272
using the positions you provide via a dictionary or the positions are
273
computed with a layout function. The edges are lines between those dots.
274

    
275
.. nbplot::
276

    
277
   >>> import matplotlib.pyplot as plt
278
   >>> G = nx.cubical_graph()
279
   >>> plt.subplot(121)
280
   <matplotlib.axes._subplots.AxesSubplot object at ...>
281
   >>> nx.draw(G)   # default spring_layout
282
   >>> plt.subplot(122)
283
   <matplotlib.axes._subplots.AxesSubplot object at ...>
284
   >>> nx.draw(G, pos=nx.circular_layout(G), node_color='r', edge_color='b')
285

    
286
See the :doc:`examples </auto_examples/index>` for more ideas.
287

    
288
Data Structure
289
--------------
290

    
291
NetworkX uses a "dictionary of dictionaries of dictionaries" as the
292
basic network data structure.  This allows fast lookup with reasonable
293
storage for large sparse networks.  The keys are nodes so ``G[u]`` returns
294
an adjacency dictionary keyed by neighbor to the edge attribute
295
dictionary. A view of the adjacency data structure is provided
296
by the dict-like object ``G.adj`` as e.g. ``for node, nbrsdict in G.adj.items():``.
297
The expression ``G[u][v]`` returns the edge attribute dictionary itself.
298
A dictionary of lists would have also been possible, but not allow
299
fast edge detection nor convenient storage of edge data.
300

    
301
Advantages of dict-of-dicts-of-dicts data structure:
302

    
303
 - Find edges and remove edges with two dictionary look-ups.
304
 - Prefer to "lists" because of fast lookup with sparse storage.
305
 - Prefer to "sets" since data can be attached to edge.
306
 - ``G[u][v]`` returns the edge attribute dictionary.
307
 - ``n in G`` tests if node ``n`` is in graph ``G``.
308
 - ``for n in G:`` iterates through the graph.
309
 - ``for nbr in G[n]:`` iterates through neighbors.
310

    
311
As an example, here is a representation of an undirected graph with the
312
edges $(A, B)$ and $(B, C)$.
313

    
314
.. nbplot::
315

    
316
   >>> G = nx.Graph()
317
   >>> G.add_edge('A', 'B')
318
   >>> G.add_edge('B', 'C')
319
   >>> print(G.adj)
320
   {'A': {'B': {}}, 'B': {'A': {}, 'C': {}}, 'C': {'B': {}}}
321

    
322
The data structure gets morphed slightly for each base graph class.
323
For DiGraph two dict-of-dicts-of-dicts structures are provided, one
324
for successors (``G.succ``) and one for predecessors (``G.pred``).
325
For MultiGraph/MultiDiGraph we use a dict-of-dicts-of-dicts-of-dicts [#turtles]_
326
where the third dictionary is keyed by an edge key identifier to the fourth
327
dictionary which contains the edge attributes for that edge between
328
the two nodes.
329

    
330
Graphs provide two interfaces to the edge data attributes: adjacency
331
and edges. So ``G[u][v]['width']`` is the same as ``G.edges[u, v]['width']``.
332

    
333
.. nbplot::
334

    
335
   >>> G = nx.Graph()
336
   >>> G.add_edge(1, 2, color='red', weight=0.84, size=300)
337
   >>> print(G[1][2]['size'])
338
   300
339
   >>> print(G.edges[1, 2]['color'])
340
   red
341

    
342
.. code-links::
343

    
344
.. rubric:: Footnotes
345

    
346
.. [#turtles] "It's dictionaries all the way down."