HPC still does a small amount of work on the CPU, such as model initialisation and the final step of data reduction for model volume, control numbers, and stability checks. Frequent map outputs, particularly for large datasets, can also lower GPU utilisation, as the writing of outputs happens on the CPU. Even in an ideal case with a 2D-only model, 100% GPU utilisation is not achievable. If there are any 1D features in the model, the GPU utilisation will be lower still, as 1D is processed on the CPU only. A model with a 1D ESTRY connection can potentially be doing a lot of work on the CPU, perhaps as much as 90% CPU and 10% GPU. If the CPU hardware is not correctly matched with the GPU card, it can become a bottleneck for HPC-GPU runs even with only a few 1D elements. We are investigating the possibility of parallelising 1D for future releases so it is able to run on the GPU.<br>
<br>
 
= What is the difference between an implicit solution and an explicit solution? =
The implicit and explicit solutions are numerical methods for moving from a continuous to a discrete model of the world, in which quantities are computed at a finite number of points in space and time (a mesh or grid), approximating the partial differential equations by a system of algebraic equations.<br>
 
The mathematical description of many physical processes involves differential equations. In the case of the shallow water equations, the rates of change of depth and velocity at a particular point in space and time are described as functions of the depth and velocity, and the spatial gradients of the depth and velocity, at that location. The solution evolves according to its current state. The depth and velocity at some small time increment into the future can be estimated from the current values and their computed rates of change, but here is the problem: the rates of change keep changing and are a function of the solution as it evolves. The small change in the solution that occurs during the time increment influences the derivatives, and therefore the change in the solution depends on itself. The bigger the time increment, the stronger this dependence. There are two fundamentally different approaches to finding solutions for problems where the rates of change (in time or space) of a set of variables are a complex function of themselves.<br>
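In schematic terms (the notation below is generic, not specific to TUFLOW), once space has been discretised the problem reduces to a system of ordinary differential equations, and the two approaches differ in where the rate of change is evaluated:
<math display="block">\frac{dU}{dt} = f(U), \qquad \underbrace{U^{n+1} = U^{n} + \Delta t \, f(U^{n})}_{\text{explicit}} \qquad \underbrace{U^{n+1} = U^{n} + \Delta t \, f(U^{n+1})}_{\text{implicit}}</math>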
The explicit approach is to estimate the future state based solely on the current state and the current rate of change. The simplest approach is known as the “forward Euler method”, which is easy to understand and implement, but often unstable for systems with little or no energy loss mechanism, and even when it is stable the solution is usually quite sensitive to the size of the timestep. Higher order methods break the time increment down into sub-steps in order to account for “the rate of change of the rates of change”. Such methods include the second order Euler method and the fourth order Runge-Kutta method (which is what TUFLOW HPC uses). Explicit methods attempt to capture all of the physics that the equations admit, and will often become unstable if the timestep is not small enough to capture that physics accurately. Hence explicit schemes for fluid flow problems have to adjust the timestep according to flow velocity (Courant number), wave speed (Celerity number), and diffusion speed (Peclet number).<br>
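As an illustration only, here is a minimal sketch contrasting a forward Euler step with a classical fourth order Runge-Kutta step on a toy decay equation (generic numerical-methods code, not TUFLOW's solver):
<syntaxhighlight lang="python">
# Toy explicit timestepping: one forward Euler step versus one classical
# fourth-order Runge-Kutta (RK4) step for dU/dt = f(U). Schematic only.

def forward_euler_step(f, u, dt):
    """First order explicit update: f is evaluated only at the current state."""
    return u + dt * f(u)

def rk4_step(f, u, dt):
    """Fourth order explicit update: sub-steps sample how the rate of
    change itself changes across the time increment."""
    k1 = f(u)
    k2 = f(u + 0.5 * dt * k1)
    k3 = f(u + 0.5 * dt * k2)
    k4 = f(u + dt * k3)
    return u + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6

# Exponential decay dU/dt = -2U with exact solution U(t) = exp(-2t).
f = lambda u: -2.0 * u
u_euler = u_rk4 = 1.0
dt = 0.1
for _ in range(10):                 # integrate to t = 1.0
    u_euler = forward_euler_step(f, u_euler, dt)
    u_rk4 = rk4_step(f, u_rk4, dt)
print(u_euler, u_rk4)               # RK4 is far closer to exp(-2) ~ 0.1353
</syntaxhighlight>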
The implicit approach is to estimate the future state based on the current state and the future rate of change. There are two common approaches to this “circular reference” problem. One is to reformulate the equations as a matrix problem, and the other is to use an iterative approach whereby the future state is repeatedly updated after successive calculations of the rate of change at the future time (also known as a backward implicit scheme). Both approaches can be very stable and enable larger timesteps, but the solution is permitted to “skip over” physics that happens on timescales smaller than the time increment. The timestep must still be appropriate based on the Courant, Celerity, and Peclet numbers, but due to the iterative nature of the solution the timestep can often be 10-20 times larger than that required for an explicit solution.<br>
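A minimal sketch of the iterative flavour, again on a toy decay equation and not TUFLOW's actual scheme, where the future state is repeatedly refined using the rate of change evaluated at the current estimate of the future state:
<syntaxhighlight lang="python">
# Toy backward (implicit) Euler step solved by fixed-point iteration:
#   U_next = U + dt * f(U_next)
# Production implicit solvers use matrix solves or Newton iteration,
# which converge for much larger timesteps than this simple sweep.

def backward_euler_step(f, u, dt, sweeps=30):
    u_next = u                       # initial guess: future equals current
    for _ in range(sweeps):
        u_next = u + dt * f(u_next)  # rate of change at the future estimate
    return u_next

f = lambda u: -2.0 * u
u, dt = 1.0, 0.4    # this simple sweep converges while |dt * df/du| < 1
for _ in range(5):                   # integrate to t = 2.0
    u = backward_euler_step(f, u, dt)
print(u)            # stable at this timestep; exact solution exp(-4) ~ 0.018
</syntaxhighlight>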
<br>
 
= What is the difference between a finite difference scheme and a finite volume scheme? =
A finite difference scheme considers the solution data at discrete points (or nodes) in space and attempts to compute the time derivatives based on the solution data and its spatial derivatives evaluated at those discrete points. While this approach can be relatively simple to implement, the solution is often non-conservative, i.e. the total sum of a property that should be conserved, taken over all points in the model, might increase or decrease slightly from one timestep to the next even in the absence of internal sources or boundary fluxes.<br>
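As a schematic example only (a non-conservative pointwise discretisation of the inviscid Burgers' equation, not TUFLOW's equations), the grid total can drift because each node is updated independently from its local gradient:
<syntaxhighlight lang="python">
# Toy non-conservative finite difference update for inviscid Burgers'
# equation u_t + u * u_x = 0, discretised pointwise at the nodes.
# The update uses the local gradient directly, so the grid total sum(u)
# can drift even though the end values are held fixed.
dt, dx = 0.1, 1.0
u = [1.0, 2.0, 3.0, 2.0, 1.0]           # nodal values
u_new = list(u)
for i in range(1, len(u) - 1):          # interior nodes only
    dudx = (u[i] - u[i - 1]) / dx       # backward difference gradient
    u_new[i] = u[i] - dt * u[i] * dudx  # pointwise, non-conservative update
print(sum(u), sum(u_new))               # 9.0 vs 8.7: the total has drifted
</syntaxhighlight>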
A finite volume scheme uses a mesh that defines interconnected volumes (or cells). The solution data for each cell represents the volume integral (or average) of a conserved property (e.g. mass and momentum) over that cell. The fluxes of the conserved values across cell faces are computed, and the time derivatives for each cell computed according to the total sum of inflows and outflows. The solution scheme is a little more involved to implement, but is guaranteed to be conservative: the model-wide integral of conserved properties remains constant save for internal sources and boundary fluxes.<br>
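The same toy problem rewritten in flux form (again purely illustrative) shows the conservation property: every outflow across a face is another cell's inflow, so the exchanges cancel exactly:
<syntaxhighlight lang="python">
# Toy conservative finite volume update for the same Burgers' problem:
# each cell stores an average value, and its change is the flux entering
# minus the flux leaving across its faces. With periodic boundaries every
# outflow is another cell's inflow, so the total is preserved.
dt, dx = 0.1, 1.0
u = [1.0, 2.0, 3.0, 2.0, 1.0]                   # cell averages
flux = [0.5 * v ** 2 for v in u]                # upwind face flux u^2 / 2
# Python's negative indexing supplies the periodic wrap at i = 0.
u_new = [u[i] - dt / dx * (flux[i] - flux[i - 1]) for i in range(len(u))]
print(sum(u), sum(u_new))           # totals agree to numerical precision
</syntaxhighlight>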
TUFLOW Classic is an implicit finite difference scheme. This means that it can use larger timesteps, but it can miss short time-scale physics and it is non-conservative. The exact scheme used (Stelling and Syme) becomes reasonably conservative when the timestep is appropriate and the number of convergence iterations is sufficient. However, as the scheme utilises a matrix solution, it requires a particular cell ordering for computations, and this makes it very difficult to parallelise. This is why TUFLOW Classic remains a single CPU-core process.<br>
TUFLOW HPC utilises an explicit finite volume scheme. This means that it has to use smaller timesteps, and it is guaranteed to capture the shortest time-scale physics that the given spatial resolution admits. The solution conserves mass and momentum to numerical precision. The scheme is not as computationally efficient as the implicit finite difference scheme of TUFLOW Classic; if forced to execute on a single CPU core it is many times slower. However, as the cell-by-cell computations of fluxes and derivatives are completely independent, the scheme is well suited to highly parallelised compute hardware such as modern GPUs. The end result is that with a good GPU, TUFLOW HPC can be up to 100 times faster than Classic for some models.<br>