Benchmarks

Warning

The following benchmarks are for general orientation only. Benchmark results are highly platform-dependent (e.g. on processor frequency); your mileage may vary.

Varying problem size

This benchmark varies the size of the computational domain and records the runtime per iteration. The computations are executed on a single computing node with 25 CPU cores and an Nvidia Tesla K80 GPU.

We run the same model code with all RoGeR backends (numpy, numpy-mpi, jax, jax-mpi, jax-gpu).
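A minimal sketch of how a runtime-per-iteration measurement over several problem sizes could be set up is shown here. It is not RoGeR's benchmark harness: `step` is a hypothetical stand-in for one model iteration, and the problem sizes, iteration counts and the helper `runtime_per_iteration` are assumptions chosen for illustration.

```python
import time

import numpy as np


def step(state):
    """Hypothetical stand-in for one model iteration (not RoGeR code)."""
    return 0.5 * (state + np.roll(state, 1, axis=0))


def runtime_per_iteration(n_cells, n_iter=10, n_warmup=2):
    """Return the mean wall-clock time in seconds of one iteration."""
    # Build a square grid with roughly n_cells cells.
    side = int(round(n_cells ** 0.5))
    state = np.random.default_rng(0).random((side, side))
    # Warm-up iterations are excluded from the timing.
    for _ in range(n_warmup):
        state = step(state)
    start = time.perf_counter()
    for _ in range(n_iter):
        state = step(state)
    return (time.perf_counter() - start) / n_iter


# Assumed problem sizes; the real benchmark may use different ones.
sizes = [10**2, 10**4, 10**6]
timings = {n: runtime_per_iteration(n) for n in sizes}
for n, t in timings.items():
    print(f"{n:>8d} cells: {t:.2e} s per iteration")
```

The warm-up loop matters most for just-in-time compiled backends, where the first calls include compilation time. With JAX, results are also computed asynchronously, so a timing loop would additionally need to wait for the result (for example with `block_until_ready()`) before stopping the clock.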

../_images/SVAT_size_scaling.png
../_images/SVAT_size_speedup_numpy.png
../_images/energy_footprint_svat_iteration_size.png

As a rule of thumb, we find that the JAX backends reduce runtime (approximately 2 times faster than NumPy) and energy usage. GPUs are a competitive alternative to CPUs as long as the problem fits into GPU memory.
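Whether a problem fits into GPU memory can be estimated roughly before running it. The sketch below is a back-of-the-envelope calculation only: the helper `fits_in_memory`, the number of fields per cell and the amount of device memory are illustrative assumptions, not values taken from RoGeR or from the benchmark hardware.

```python
def fits_in_memory(n_cells, n_fields, memory_bytes, bytes_per_value=8):
    """Estimate the array footprint and whether it fits into `memory_bytes`."""
    required = n_cells * n_fields * bytes_per_value
    return required, required <= memory_bytes


# All numbers below are illustrative assumptions, not RoGeR measurements.
gpu_memory = 12 * 1024**3  # assumed 12 GiB of usable device memory
n_fields = 100             # assumed number of float64 fields per grid cell

for n_cells in (10**5, 10**6, 10**7, 10**8):
    required, ok = fits_in_memory(n_cells, n_fields, gpu_memory)
    print(f"{n_cells:>10d} cells: {required / 1024**3:7.2f} GiB, fits: {ok}")
```

The estimate counts only the persistent arrays. Temporary arrays created during a time step add to the footprint, so the result should be read as a lower bound.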

Varying number of MPI processes

RoGeR is run with a fixed problem size but a varying number of processes (strong scaling). This allows us to evaluate how the runtime scales with increasing CPU count. The problem size corresponds to 1 billion cells.

The benchmark is executed on the bwForCluster BinAC. Each computing node contains 28 CPUs.

../_images/SVAT_nproc_scaling.png

The results show that RoGeR scales well with an increasing number of processes.
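Scaling experiments with a fixed problem size are commonly summarized by two numbers: the speedup relative to the smallest process count, and the parallel efficiency (speedup divided by the increase in process count). The following sketch shows the arithmetic. The function `strong_scaling` is a hypothetical helper, and the runtimes are placeholders chosen for illustration; they are not measured RoGeR results.

```python
def strong_scaling(runtimes):
    """Map process count -> (speedup, efficiency) relative to the smallest count."""
    base_procs = min(runtimes)
    base_time = runtimes[base_procs]
    result = {}
    for procs in sorted(runtimes):
        speedup = base_time / runtimes[procs]
        # Ideal scaling gives a speedup equal to procs / base_procs.
        efficiency = speedup / (procs / base_procs)
        result[procs] = (speedup, efficiency)
    return result


# Placeholder runtimes in seconds per iteration; NOT measured RoGeR results.
example = {1: 8.0, 2: 4.0, 4: 2.5, 8: 1.6}
for procs, (speedup, efficiency) in strong_scaling(example).items():
    print(f"{procs:>3d} processes: speedup {speedup:.2f}, efficiency {efficiency:.2f}")
```

An efficiency close to 1 means that adding processes reduces the runtime almost proportionally; values well below 1 indicate that communication or load imbalance starts to dominate.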