HPC FAQ
'''Over-utilisation of CPU threads/cores'''<br>
This occurs when multiple HPC simulations are run across the same CPU threads. For example, if your computer has 4 CPU threads and you run two simulations that each request 4 threads, you are overloading the CPU hardware by requesting 8 threads in total. This will slow the simulations down by more than a factor of 2. The most efficient approach in this case is to run both simulations using 2 threads each, noting that any other CPU intensive tasks running at the same time also need to be considered.<br>
From the 2020-01 release onwards the default number of CPU threads is four (4); previously the default was two (2). You can control the number of threads requested either with the <font color="blue"><tt>-nt<number_threads></tt></font> run time option, e.g. <font color="blue"><tt>-nt2</tt></font>, or with the TCF command <font color="blue"><tt>CPU Threads</tt></font>. The -nt run time option overrides <font color="blue"><tt>CPU Threads</tt></font>, as shown in the example below.<br>
Note: If Windows hyperthreading is active, there will typically be two threads for each physical core. For computationally intensive processes such as TUFLOW, it is recommended that hyperthreading be deactivated so there is one thread for each core.<br>
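A minimal sketch of both approaches is below; the executable path and file names are hypothetical, so substitute your own.
<pre>
! TCF command - request two CPU threads for this simulation
CPU Threads == 2
</pre>
<pre>
:: Batch file - the -nt2 run time option overrides the TCF CPU Threads command
"C:\TUFLOW\TUFLOW_iSP_w64.exe" -b -nt2 my_model.tcf
</pre>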
= With the Wu turbulence scheme being the new default, are old models using Smagorinsky wrong? =
Turbulence is pronounced in areas of highly transient flow, e.g. high velocities, bends, ledges and flow contraction/expansion. Where the flow is more benign and/or bed roughness is high, turbulence is less important, as it only applies where there are strong spatial velocity gradients. For example, for uniform flow in a straight rectangular channel the turbulence term is zero, as there is no spatial velocity gradient.<br>
The problem with Smagorinsky, which is a large scale eddy turbulence model originally developed for coastal modelling, is that it is cell size dependent (the eddy viscosity is proportional to the cell surface area) and tends to zero as the cell size tends to zero. Historically this has not been a major issue, as cell sizes have typically been greater than the depth. The general recommendation in the <u>[https://docs.tuflow.com/classic-hpc/manual/latest/ TUFLOW Manual]</u> is to be careful of using cell sizes significantly smaller than the depth (see Section 1.4). However, as cells have become finer and finer with the advent of GPU models this issue has increasingly emerged; it is particularly pertinent if a quadtree or flexible mesh is used with cells that are very small relative to the depth. If this is the case, bigger differences will be present for bigger events, where the water levels and velocities are higher.<br>
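To see why, consider a common depth-averaged form of the Smagorinsky eddy viscosity from the literature (generic notation, not necessarily TUFLOW's exact formulation):
<math>\nu_t = C_s^2 \, A \sqrt{\left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial v}{\partial y}\right)^2 + \frac{1}{2}\left(\frac{\partial u}{\partial y} + \frac{\partial v}{\partial x}\right)^2}</math>
where <math>C_s</math> is the Smagorinsky coefficient, <math>A</math> the cell surface area and <math>u</math>, <math>v</math> the depth-averaged velocities. As <math>\nu_t \propto A</math>, halving the cell size quarters the eddy viscosity, and it vanishes as the cell size tends to zero.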
TUFLOW changed, many years ago, from purely Constant or purely Smagorinsky viscosity to Smagorinsky plus (a small amount of) Constant. This improved the absorption of eddies into the streamlines behind a bluff body and helped, to varying degrees, with modelling at finer cell sizes. This approach worked well at the time; however, the new Wu turbulence scheme has now been implemented to improve the situation further.<br>
The Smagorinsky/Constant turbulence combination has served the industry well and can continue to be used where the cell sizes are not significantly smaller than the depth in areas of highly transient flow. If the model is well calibrated (using conventional parameters), continuing to use the Smagorinsky/Constant turbulence option is fine. Therefore, it is not considered that TUFLOW (or other good 2D solvers) have been producing questionable results, but that an improved turbulence representation is needed for 2D schemes with fine-scale cells. This is especially the case for the new Quadtree mesh option and for flexible meshes that utilise fine-scale cells.<br>
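For illustration, the turbulence scheme is selected in the TCF. The commands below are an assumption based on the 2020 release syntax; command names and default coefficients may differ between versions, so confirm against the manual and release notes for your version.
<pre>
! 2020-01 and later HPC default - cell size insensitive
Viscosity Formulation == WU

! Legacy combination - acceptable where cells are not significantly
! smaller than the depth in highly transient flow areas
! Viscosity Formulation == Smagorinsky
! Viscosity Coefficient == 0.5, 0.05   ! Smagorinsky, Constant (m2/s)
</pre>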
The suggestions below can be implemented to eliminate the instability and/or the decrease in control numbers:
* Specify an initial water level, either for the whole model with the <font color="blue"><tt>Set IWL</tt></font> command or locally with the <font color="blue"><tt>Read GIS IWL</tt></font> command. The wet cells limit the adaptive timestep through the <u>[[HPC_Adaptive_Timestepping#HPC_2D_Timestep | Shallow Wave Celerity Number]]</u> and prevent the HPC solver from using excessively large timesteps.
* Use the <font color="blue"><tt>Timestep Maximum</tt></font> command to cap the maximum timestep. A good starting value is half the cell size in metres, e.g. if the cell size is 5m, set Timestep Maximum to 2.5 seconds. The .hpc.tlf file can be checked to see if further refinement is needed. A sketch of both suggestions follows this list.
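A minimal TCF sketch of both suggestions; the water level, GIS layer name and timestep are hypothetical values for a model with 5m cells.
<pre>
! Pre-wet the whole model with an initial water level (metres)...
Set IWL == 0.5
! ...or set it locally from a GIS layer (hypothetical file name)
! Read GIS IWL == gis\2d_iwl_lowlands_R.shp

! Cap the adaptive timestep at roughly half the cell size in metres
Timestep Maximum == 2.5
</pre>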
<br>
= Should I see 100% GPU utilisation when no other processes are running on the GPU? =
HPC still performs a small amount of work on the CPU, such as model initiation and the final step of data reduction for model volume, control numbers and stability checks. Frequent map output, particularly for large datasets, can also contribute to lower GPU utilisation, as the writing of outputs happens on the CPU. Even in a perfect world with a 2D only model, it is not possible to see 100% GPU utilisation. If there are any 1D features in the model, the GPU utilisation will be lower again, as 1D is processed on the CPU only. A model with a 1D ESTRY connection can potentially be doing a lot of work on the CPU, perhaps as much as 90% CPU and 10% GPU. If the CPU hardware is not well matched to the GPU card, it can become a bottleneck for HPC GPU runs, even with only a few 1D elements.
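To observe the actual utilisation during a run, NVIDIA's nvidia-smi command line tool can be used from a command prompt, e.g.:
<pre>
:: Report GPU utilisation and memory use every second
nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv -l 1
</pre>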
<br>