Flood Modeller-TUFLOW Benchmarking
Introduction
The fixed grid TUFLOW software is available with two solvers: an implicit finite-difference solver, now branded as TUFLOW Classic, and an explicit finite-volume engine called TUFLOW HPC, which stands for Heavily Parallelised Compute. The TUFLOW HPC solution scheme consists of highly independent calculations that can be parallelised. When these are run over multiple processors or on a GPU card, run times are significantly reduced. The following wiki post highlights the potential speed up that could be achieved by using the TUFLOW HPC engine with GPU card technology:
TUFLOW HPC and GPU Benchmarking
The benchmark models in these tests are 2D only models or 1D-2D models which use TUFLOW's 1D engine, ESTRY. The speed up is due to the 2D calculations being distributed across multiple processors and undertaken in parallel. The impact of the parallel processing is also seen in integrated 1D-2D models. As well as parallelisation of the 2D calculations, improvements have been made to the indexing of 1D-2D links, which can substantially reduce simulation times for large linked 1D-2D models.
TUFLOW's own 1D engine allows pipe network models and river models to be modelled and integrated with a 2D domain. TUFLOW can also be integrated with a range of other 1D software, including Flood Modeller Pro.
Flood Modeller Pro-TUFLOW Benchmarking
A common query is how much speed up TUFLOW HPC and GPU technology provide when TUFLOW models are linked to external 1D schemes such as Flood Modeller Pro. Benchmarking runs of linked Flood Modeller Pro-TUFLOW models have therefore been undertaken to demonstrate the impact of running the linked models with the TUFLOW HPC solver on multiple CPU cores as well as on Nvidia GPU cards. The outputs from these tests have been compared against the TUFLOW Classic engine, which cannot efficiently distribute calculations across multiple processors because of the various inter-dependencies in its matrix solver.
The benchmark simulations have been undertaken internally and, due to model ownership and licensing issues, the models cannot be supplied externally. However, if you would like to test your own Flood Modeller Pro-TUFLOW models on TUFLOW HPC GPU, we would be happy to run them on our machines to demonstrate the potential speed up. The models are real world examples, typical of the kind that Flood Modeller Pro and TUFLOW are used on.
Model 1
The first simulated model was a 287 node Flood Modeller Pro network coupled with a TUFLOW model comprising 297,746 cells at a 2m resolution. The input boundaries corresponded to a 100 year return period. The model was run for 10 hours of hydrograph time.
The model was run on a number of different machines for 5 scenarios. The 5 scenarios are as follows:
- TUFLOW Classic with a fixed timestep of 0.5 seconds
- TUFLOW HPC run on 2 cores
- TUFLOW HPC run on 4 cores
- TUFLOW HPC run on 8 cores
- TUFLOW HPC run on an Nvidia GPU card. The Nvidia GPU card varies depending on the machine being used.
The simulations used the TUFLOW 2018-03-AC release and Flood Modeller Pro 4.4.
The results of the benchmarking of model 1 are shown in Table 1. The implicit TUFLOW Classic scheme is faster than TUFLOW HPC run on 2 cores and broadly comparable to TUFLOW HPC run on 4 cores. This is due to the smaller timestep used in the explicit finite volume scheme, which uses an adaptive timestep. As the number of cores increases, the linked Flood Modeller Pro-TUFLOW HPC simulation becomes quicker, with run times reduced by up to 42% on 8 cores compared to TUFLOW Classic. A GPU card contains a large number of cores which can be used to undertake the 2D calculations. Running the same model on a GPU card leads to a reduction in run time of up to 88% compared to TUFLOW Classic (and up to 84% compared to TUFLOW HPC on 8 cores without a GPU card). This means that around 8 simulations can be run on a GPU card in the time that a single simulation takes using TUFLOW Classic, highlighting the efficiencies of investing in hardware and GPU modules.
Table 1: Runtimes for Flood Modeller Pro-TUFLOW benchmarks for Model 1
| Processor Name | Graphics Card | GPU RAM (GB) | TUFLOW Classic Runtime (s) | TUFLOW HPC Runtime (s) on 2 Cores | TUFLOW HPC Runtime (s) on 4 Cores | TUFLOW HPC Runtime (s) on 8 Cores | TUFLOW HPC Runtime (s) on GPU Card | % Reduction in run time between TUFLOW Classic and TUFLOW HPC on a GPU |
|---|---|---|---|---|---|---|---|---|
| Intel(R) Core(TM) i7-7820X CPU @ 3.6 GHz | Nvidia GeForce GTX 1080 | 8 | 6041 | 8071 | 4718 | 3502 | 1022 | 83% |
| Intel(R) Core(TM) i7-7820X CPU @ 3.6 GHz | Nvidia GeForce RTX 2080 Ti | 11 | 5944 | 8611 | 5666 | 4567 | 726 | 88% |
| Intel(R) Core(TM) i7-7700HQ CPU @ 2.8 GHz | Nvidia GeForce GTX 1050 | 4 | 6622 | 10867 | 7857 | N/A | 3171 | 52% |
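The percentages quoted for Model 1 can be reproduced from the run times in Table 1. The following is a minimal sketch, assuming each percentage is the reduction in run time relative to the slower configuration, i.e. 100 × (1 − new / baseline). The function and key names are illustrative only and are not part of TUFLOW or Flood Modeller Pro.

```python
# Run times in seconds copied from Table 1 (Model 1).
# None marks a configuration that was not run (N/A in the table).
TABLE_1 = [
    {"gpu": "GTX 1080", "classic": 6041, "hpc_8_cores": 3502, "hpc_gpu": 1022},
    {"gpu": "RTX 2080 Ti", "classic": 5944, "hpc_8_cores": 4567, "hpc_gpu": 726},
    {"gpu": "GTX 1050", "classic": 6622, "hpc_8_cores": None, "hpc_gpu": 3171},
]


def percent_reduction(baseline_s, new_s):
    """Run time reduction of new_s relative to baseline_s, in percent."""
    return 100.0 * (1.0 - new_s / baseline_s)


def summarise(rows):
    """Return one dict of rounded run time reductions per machine."""
    out = []
    for row in rows:
        entry = {
            "gpu": row["gpu"],
            "gpu_vs_classic": round(percent_reduction(row["classic"], row["hpc_gpu"])),
        }
        # The 8-core comparisons are only possible where an 8-core run exists.
        if row["hpc_8_cores"] is not None:
            entry["cores8_vs_classic"] = round(
                percent_reduction(row["classic"], row["hpc_8_cores"])
            )
            entry["gpu_vs_cores8"] = round(
                percent_reduction(row["hpc_8_cores"], row["hpc_gpu"])
            )
        out.append(entry)
    return out


if __name__ == "__main__":
    for entry in summarise(TABLE_1):
        print(entry)
```

Under this assumption, the GPU versus Classic values match the last column of Table 1, the 42% figure comes from the 8-core run on the GTX 1080 machine, and the 84% figure comes from the GPU versus 8-core comparison on the RTX 2080 Ti machine.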
Model 2
The second model contains 228 Flood Modeller Pro nodes and a 5m resolution TUFLOW grid with a total of 570,214 cells. The model was run for 26 hours of model time with a 100 year boundary condition. The TUFLOW Classic model was run with a 1.25 second timestep. Table 2 shows the run times for model 2 when run with the same scenarios as model 1. The table shows that run times can be reduced by up to 82% when the model is run on a GPU compared to TUFLOW Classic. Even compared with TUFLOW HPC run on 8 cores, the TUFLOW HPC GPU run time can be up to 74% shorter.
Table 2: Runtimes for Flood Modeller Pro-TUFLOW benchmarks for Model 2
| Processor Name | Graphics Card | GPU RAM (GB) | TUFLOW Classic Runtime (s) | TUFLOW HPC Runtime (s) on 2 Cores | TUFLOW HPC Runtime (s) on 4 Cores | TUFLOW HPC Runtime (s) on 8 Cores | TUFLOW HPC Runtime (s) on GPU Card | % Reduction in run time between TUFLOW Classic and TUFLOW HPC on a GPU |
|---|---|---|---|---|---|---|---|---|
| Intel(R) Core(TM) i7-7820X CPU @ 3.6 GHz | Nvidia GeForce RTX 2080 Ti | 11 | 5946 | 10229 | 6543 | 4722 | 1251 | 79% |
| Intel(R) Xeon(TM) E5-2670 v3 CPU @ 2.3 GHz | Nvidia GeForce GTX 980 | 4 | 19693 | 21242 | 14438 | 9516 | 3603 | 82% |
| Intel(R) Core(TM) i7-7700HQ CPU @ 2.8 GHz | Nvidia GeForce GTX 1050 | 4 | 6840 | 20816 | XX | N/A | 3886 | 43% |
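The comparison between the GPU run and the CPU-only TUFLOW HPC runs for Model 2 can be checked in the same way. The sketch below assumes the same definition of percentage reduction (100 × (1 − GPU / CPU run time)) and treats the XX and N/A entries in Table 2 as missing. The data structure and function name are illustrative only.

```python
# Run times in seconds copied from Table 2 (Model 2).
# None marks entries with no recorded run time (XX or N/A in the table).
TABLE_2 = {
    "RTX 2080 Ti": {"cores": {2: 10229, 4: 6543, 8: 4722}, "gpu": 1251},
    "GTX 980": {"cores": {2: 21242, 4: 14438, 8: 9516}, "gpu": 3603},
    "GTX 1050": {"cores": {2: 20816, 4: None, 8: None}, "gpu": 3886},
}


def gpu_reduction_by_core_count(machine):
    """Percent run time reduction of the GPU run versus each CPU HPC run."""
    reductions = {}
    for n_cores, runtime_s in machine["cores"].items():
        if runtime_s is None:
            continue  # skip configurations with no recorded run time
        reductions[n_cores] = round(100.0 * (1.0 - machine["gpu"] / runtime_s))
    return reductions


if __name__ == "__main__":
    for card, machine in TABLE_2.items():
        print(card, gpu_reduction_by_core_count(machine))
```

On these numbers, the 74% figure corresponds to the 8-core comparison on the RTX 2080 Ti machine.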
Discussion
The results from the benchmarking of the hydraulic runs clearly show the impact of running models on TUFLOW HPC compared to TUFLOW Classic, and in particular how much running TUFLOW HPC on a GPU reduces model run times. The results show that linked Flood Modeller Pro-TUFLOW HPC runs on a GPU are up to 8 times faster than TUFLOW Classic and up to 6 times faster than TUFLOW HPC run on 8 CPU cores. Reducing model run times improves efficiency by cutting the time spent waiting for simulations when various schematisation options are tested during the model build phase, increases productivity when running the final design runs over a suite of return periods and durations, reduces licensing costs, and opens up opportunities for uncertainty analysis through Monte-Carlo type approaches, which require a large ensemble of simulations. TUFLOW HPC can also be run on multiple GPU cards, which allows very large 2D model domains to be modelled.
The simulation times reported here were obtained using the latest release, Flood Modeller Pro 4.4. Although Flood Modeller Pro 4.4 supports TUFLOW HPC and GPU cards, it does not support the concurrent simulation of Flood Modeller Pro, TUFLOW HPC and TUFLOW’s 1D scheme, ESTRY. The ability to run all three linked together is scheduled to be available in the Flood Modeller Pro 4.5 release, which has been provided in a beta testing phase for this analysis. Running the Flood Modeller Pro, TUFLOW 2D and TUFLOW 1D components together allows fully integrated drainage models to be simulated, comprising open channels, 2D floodplains and 1D pipe networks, together with their interaction with the surface as represented by the 2D domain. The benchmarking of these types of models is currently being undertaken.
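To illustrate what these factors mean for a large ensemble, the sketch below converts the Model 1 run times from the i7-7820X / RTX 2080 Ti machine into speed up factors and total wall-clock time. The ensemble size of 100 is purely hypothetical, and the sketch assumes that runs are executed one after another on a single machine and that every ensemble member takes the same time as the benchmark run.

```python
# Model 1 run times in seconds from Table 1 (i7-7820X with RTX 2080 Ti).
CLASSIC_S = 5944
HPC_8_CORES_S = 4567
HPC_GPU_S = 726

# Hypothetical ensemble size, chosen only for illustration.
N_RUNS = 100


def speed_up_factor(baseline_s, new_s):
    """How many times faster the new configuration is than the baseline."""
    return baseline_s / new_s


def ensemble_hours(runtime_s, n_runs):
    """Wall-clock hours for n_runs simulations run one after another."""
    return runtime_s * n_runs / 3600.0


if __name__ == "__main__":
    print("GPU vs Classic:", round(speed_up_factor(CLASSIC_S, HPC_GPU_S), 1))
    print("GPU vs 8 cores:", round(speed_up_factor(HPC_8_CORES_S, HPC_GPU_S), 1))
    print("Classic ensemble (h):", round(ensemble_hours(CLASSIC_S, N_RUNS), 1))
    print("GPU ensemble (h):", round(ensemble_hours(HPC_GPU_S, N_RUNS), 1))
```

Under these assumptions, a 100 run ensemble that would take roughly a week of continuous computation with TUFLOW Classic would complete in under a day on the GPU.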