Running IoNS on a cluster
By default, the framework launches a local dask cluster with four workers, which is good enough for testing and preparing a recipe on a subset of data. But for evaluating large datasets, one will want to make better use of the resources available.
For utilizing a SLURM cluster, add the following parameters to the
run_recipe.py
call:
--slurm \ # use SLURM
--partition cm # SLRUM parttion to use
--nodelist i4,i5 \ # the SLURM nodelist, see SLURM documentation
--worker 16 # use 16 worker
--mem 2 # use 2GB memory per worker
The amount of workers should be set to the number of cores available on one
node. Be aware that pandas uses
numexpr to accelerate some
calculations and may spawn multiple threads in each worker. For better
performance, scalability and resource utilization, one should set
the environment variable NUMEXPR_MAX_THREADS
appropriately, either to 1
or
2
, after testing and observing with a small subset of the data. Default is the
number of cores or 8, whichever is less. This only sets the maximum
number of threads, to set the initial number of threads spawned to a specific
value, set NUMEXPR_NUM_THREADS
to the desired value.
See here
for details on configuring the numexpr threadpool.
Note that not every operation can be parallelized (and I/O speed can sometimes
limit performance as well), therefore the actual performance is often more
limited by the framework code and the code in the recipe than by the available
cores. Note also that the dask documentation suggests setting the number of numexpr
threads to one (see here and
here).
This might not be the most optimal suggestion for every workload, so to achieve
more optimal performance, some experimentation with the actual workload is most
likely necessary.
If you are using a SLURM node on a partition other than the default (check with
sinfo -Nl
), add --partition <partition_name>
to the call.
A already running dask cluster can be used by appending --cluster <cluster_address>
to the command line. See the setup
here
For debugging purposes a single-threaded mode is available, selectable by adding
--single-threaded
to the command line.