The NNGT package can generate networks using multithreaded algorithms. This feature means that the computation is distributed on several CPUs and can be useful for:
However, the multithreading part concerns only the generation of the edges; if a graph library such as graph-tool, igraph, or networkx is used, the building process of the graph object will be taken care of by this library. Since this process is not multithreaded, obtaining the graph object can take much longer than the actual generation process.
NNGT provides two types of parallelism:
OpenMP multithreading, which is enabled with nngt.set_config("multithreading", True) or, setting the number of threads, with nngt.set_config("omp", 8) to use 8 threads.
MPI, which is enabled with nngt.set_config("mpi", True). In that case, the python script must be run as mpirun -n 8 python name_of_the_script.py to be run in parallel.
These two ways of running code in parallel differ widely, both regarding the situations in which they can be useful, and in the way the user should interact with the resulting graph.
The easiest tool is OpenMP, because on the user side it does not differ significantly from the single-thread case, which is why we will describe it first. Using MPI is quite different: the user must adapt the code to use it, and the required changes depend on the backend used.
When using parallel algorithms, additional care is necessary when dealing with random number generation. Here again, the situation differs between the OpenMP and MPI cases.
Warning
Never use the standard random module, only use numpy.random!
When using OpenMP, the parallel algorithms will use the random seeds defined by the user through nngt.set_config("seeds", list_of_seeds). One seed per thread is necessary.
These seeds are not used on the python level, so they are independent from
whatever random generation could happen using numpy
(e.g. to set node positions in space, or to generate attributes).
To make a simulation fully reproducible, the user must set both the random
seeds and the python level random number generators through the master seed.
For instance, with 4 threads:
import nngt
master_seed = 0  # seeds the python-level random number generators
nngt.set_config({"msd": master_seed, "seeds": [1, 2, 3, 4]})  # one seed per thread
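The one-seed-per-thread rule can be made explicit with a small helper. The following is only a sketch: the function name make_seed_config and the rule used to derive the seeds (consecutive integers after the master seed) are hypothetical choices for illustration, not part of NNGT. Only the "omp", "msd", and "seeds" keys come from the text above.

```python
def make_seed_config(master_seed, num_threads):
    """Build a configuration dict holding exactly one seed per thread.

    Hypothetical helper: the seed-derivation rule below is an arbitrary
    illustration, any list of `num_threads` distinct integers would do.
    """
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    # Consecutive integers after the master seed, one per thread.
    seeds = [master_seed + 1 + i for i in range(num_threads)]
    return {"omp": num_threads, "msd": master_seed, "seeds": seeds}


config = make_seed_config(master_seed=0, num_threads=4)

# The number of seeds must match the number of threads.
assert len(config["seeds"]) == config["omp"]
print(config)
```

Assuming the three keys can be combined in a single dictionary, the result would then be passed to nngt.set_config(config) before any graph generation.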
Warning
This is also how you should initialize random numbers when using MPI!
This may surprise experienced MPI users, but NNGT is implemented in such a way that shared properties are generated on all threads through the initial python master seed, then generation algorithms save the current common state, then re-initialize the RNGs for parallel generation, and finally restore the previous, common random state once the parallel generation is done. Of course the parallel initialization differs every time, but it is changed in a reproducible way through the master seed.
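The save, re-initialize, and restore sequence described above can be illustrated with numpy.random alone. This is a minimal sketch of the general pattern, not NNGT's actual implementation: the function name draw_with_local_seed and the way the local seed is chosen are assumptions made for the example.

```python
import numpy as np


def draw_with_local_seed(local_seed, size):
    """Draw numbers from a temporarily re-seeded RNG, then restore the
    common state so that later draws are unaffected."""
    saved_state = np.random.get_state()  # save the current common state
    np.random.seed(local_seed)           # re-initialize for the parallel part
    values = np.random.random(size)      # stand-in for parallel generation
    np.random.set_state(saved_state)     # restore the previous common state
    return values


np.random.seed(0)                        # master seed, shared by everyone
shared_before = np.random.random()       # a "shared property"
local_values = draw_with_local_seed(local_seed=42, size=3)
shared_after = np.random.random()        # continues the common sequence

# Reference run without any local draw: the common sequence is identical.
np.random.seed(0)
reference = (np.random.random(), np.random.random())
assert (shared_before, shared_after) == reference
```

Because the common state is restored after the local draws, every process that starts from the same master seed keeps producing the same shared values, whatever its local seed was.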
Note
MPI algorithms are currently restricted to gaussian_degree() and distance_rule() only.
Handling MPI can be significantly more difficult than using OpenMP because it differs more strongly from the “standard” single-thread case.
NNGT provides two different ways of using MPI:
Warning
When using MPI with graph-tool, igraph, or networkx, all operations on the graph that has been generated must be limited to the root process. To that end, NNGT provides the on_master_process() function that returns True only on the root MPI process.
Using the ‘nngt’ backend, the edge_nb() method, as well as all other edge-related methods, will return information on the local edges only!
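Restricting graph operations to the root process usually takes the form of a simple guard. The sketch below uses stand-ins so that it runs without NNGT or MPI: FakeGraph and edge_count_on_root are hypothetical names, and the is_root argument plays the role of on_master_process().

```python
class FakeGraph:
    """Stand-in for a generated graph; it only mimics edge_nb()."""

    def __init__(self, num_edges):
        self._num_edges = num_edges

    def edge_nb(self):
        return self._num_edges


def edge_count_on_root(graph, is_root):
    """Query the graph on the root process only; return None elsewhere."""
    if is_root():
        return graph.edge_nb()
    return None


graph = FakeGraph(num_edges=100)

# Simulate the root process and a non-root process.
root_result = edge_count_on_root(graph, is_root=lambda: True)
other_result = edge_count_on_root(graph, is_root=lambda: False)
print(root_result, other_result)
```

In an actual MPI script, the graph would come from one of the generation functions, and is_root would be the on_master_process() function provided by NNGT (assumed here to be reachable as nngt.on_master_process).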
The python file should include (before any graph generation):
import nngt
msd = 0  # choose a master seed
seeds = [1, 2, 3, 4]  # choose initial seeds, one per MPI process
nngt.set_config({
    "mpi": True,
    "backend": "nngt",
    "msd": msd,
    "seeds": seeds,
})
The file should then be executed using:
$ mpirun -n 4 python name_of_the_script.py
Note
Graph saving is available in parallel in the fully distributed setup through the to_file() and save_to_file() functions as in any other configuration.