Configure CUDA device selection

Computers running TUFLOW may have multiple GPUs. These may be multiple NVIDIA GPUs with CUDA capabilities, used to accelerate simulation runs. Alternatively, they can be additional GPUs used for other purposes, such as rendering the interactive desktop or handling other computational tasks. A common occurrence on modern motherboards is the availability of an integrated GPU.
 
Generally, it is recommended to use a GPU that is not used for TUFLOW modelling as the primary GPU for rendering the desktop, if needed. If there is no additional GPU available, one of the NVIDIA GPUs can be used, in which case it is recommended to use the most capable card for running models and the less capable one for rendering the desktop.
 
TUFLOW allows selection of a specific GPU for computation using command-line options such as <code>-pu0</code> for the first GPU, <code>-pu1</code> for the second, and so on. See [[HPC Running and Converting Models]].
 
However, what TUFLOW considers the first or second GPU may not match expectations based on the order shown in tools such as Windows Device Manager, Task Manager, or the output of <code>nvidia-smi</code> on the command line. Another common problem is that the required GPUs are not in the expected order, which makes it difficult to select them in the preferred order.
 
To this end, an environment variable called <code>CUDA_VISIBLE_DEVICES</code> can be set, which limits the devices that will be visible to CUDA-capable applications like TUFLOW and specifies the order in which they appear. The remainder of this article outlines how to configure that setting. As an example, a Windows computer is used that has 2 NVIDIA GPUs and an on-board AMD GPU. In Windows, all available GPUs can be listed using a PowerShell command like this:
<syntaxhighlight lang="powershell">
Get-CimInstance -Namespace root\cimv2 -ClassName Win32_VideoController | Select-Object DeviceID, Name
</syntaxhighlight>
(PowerShell commands can be run by opening PowerShell from the Windows Start Menu and pasting a command there.)
 
The output for the example computer is as follows (note that even virtual adapters, such as a Remote Desktop adapter, will also appear):
<pre>
DeviceID Name
...
VideoController4 NVIDIA GeForce RTX 4090
</pre>
In this case, only 'VideoController3' and 'VideoController4' need to be visible to CUDA-enabled applications like TUFLOW. More details on those can be obtained by running the following command (from either PowerShell, Command Prompt, or a Linux shell):
<syntaxhighlight lang="batch">
nvidia-smi --query-gpu=name,uuid --format=csv,noheader,nounits
</syntaxhighlight>
The output is as follows:
<pre>
NVIDIA GeForce RTX 4090, GPU-5060f556-4eb4-7155-4020-abadcb2fd735
NVIDIA GeForce RTX 4090, GPU-f3825978-37f8-b933-5327-583196d560cd
</pre>
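The readout above can also be processed by a script. The following is a minimal Python sketch, not part of TUFLOW, that parses text in this <code>name, uuid</code> format and joins the identifiers into a value suitable for <code>CUDA_VISIBLE_DEVICES</code>. The function name <code>parse_gpu_list</code> is hypothetical, and the sample text is simply the example output copied from above.

```python
import csv
import io


def parse_gpu_list(csv_text):
    """Parse 'name, uuid' lines into a list of (name, uuid) tuples."""
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    gpus = []
    for row in reader:
        if len(row) != 2:
            continue  # skip blank or unexpected lines
        name, uuid = (field.strip() for field in row)
        gpus.append((name, uuid))
    return gpus


# Sample text copied from the example output above.
sample = """\
NVIDIA GeForce RTX 4090, GPU-5060f556-4eb4-7155-4020-abadcb2fd735
NVIDIA GeForce RTX 4090, GPU-f3825978-37f8-b933-5327-583196d560cd
"""

gpus = parse_gpu_list(sample)
# Join the identifiers in the order the cards should be exposed.
visible_value = ",".join(uuid for _, uuid in gpus)
print(visible_value)
```

On a machine with the NVIDIA driver installed, the sample text could be replaced by the captured output of the <code>nvidia-smi</code> command shown earlier, for instance via Python's <code>subprocess.run</code>.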
The tool does not list the AMD card, but up to and including version 2025.1 of TUFLOW, that card may still interfere with the GPU selection order. Also, from this readout, it is not at all clear which card is which, and the order here may not match the order expected from tools like Task Manager ('GPU 0', 'GPU 1', etc.).
 
This issue can be resolved by setting the environment variable <code>CUDA_VISIBLE_DEVICES</code>. There are two possible formats. It can either have a short value like <code>0,1</code> or a more explicit value like <code>GPU-5060f556-4eb4-7155-4020-abadcb2fd735,GPU-f3825978-37f8-b933-5327-583196d560cd</code> using the identifiers from the <code>nvidia-smi</code> output.
 
The short format just affects the default order. If using <code>-pu0</code> with TUFLOW selects the GPU considered #1 and vice versa, setting <code>CUDA_VISIBLE_DEVICES</code> to <code>1,0</code> reverses the default order. However, this order may change as new hardware is installed or existing hardware is reinstalled. The recommendation is therefore to use the explicit values in the long format.
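The difference between the two formats can be illustrated with a small Python model. This is a simplified sketch of the selection rule, not the CUDA implementation: it assumes the default enumeration order of the cards is known, and the function name <code>visible_devices</code> is hypothetical. It shows that an index-based value follows whatever the default order happens to be, while an identifier-based value keeps selecting the same cards.

```python
def visible_devices(enumerated, value):
    """Return the devices an application would see, in order.

    enumerated: list of GPU identifiers in the default enumeration order
                (assumed known for this illustration).
    value:      a CUDA_VISIBLE_DEVICES string such as "1,0" or "GPU-...,GPU-...".
    """
    visible = []
    for token in value.split(","):
        token = token.strip()
        if token.isdigit():
            index = int(token)
            if index >= len(enumerated):
                break  # an invalid entry hides itself and everything after it
            visible.append(enumerated[index])
        elif token in enumerated:
            visible.append(token)
        else:
            break  # unknown identifier: stop, as for an invalid index
    return visible


card_a = "GPU-5060f556-4eb4-7155-4020-abadcb2fd735"
card_b = "GPU-f3825978-37f8-b933-5327-583196d560cd"

# Short format: the result depends on the default order.
print(visible_devices([card_a, card_b], "1,0"))  # card_b first
print(visible_devices([card_b, card_a], "1,0"))  # card_a first after a reorder

# Long format: the result is the same for either default order.
explicit = card_a + "," + card_b
print(visible_devices([card_b, card_a], explicit))  # card_a first
```

In this model, the first entry of the returned list is the card that <code>-pu0</code> would refer to, the second the card for <code>-pu1</code>, and so on.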
 
The value of the environment variable can either be set at the start of the scripts used to run models, like batch files, PowerShell scripts, or Linux shell scripts, or it can be set globally so that it automatically applies to all running applications.
 
In a batch file or from the Command Prompt, use this (note that there are no quotes around the values; replace the values with the identifiers of the detected GPUs):
<syntaxhighlight lang="dos">
SET CUDA_VISIBLE_DEVICES=GPU-5060f556-4eb4-7155-4020-abadcb2fd735,GPU-f3825978-37f8-b933-5327-583196d560cd
</syntaxhighlight>
In a PowerShell script or from the PowerShell prompt, use this (note the quotes around the values; replace the values with the identifiers of the detected GPUs):
<syntaxhighlight lang="powershell">
$env:CUDA_VISIBLE_DEVICES = "GPU-5060f556-4eb4-7155-4020-abadcb2fd735,GPU-f3825978-37f8-b933-5327-583196d560cd"
</syntaxhighlight>
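Where models are launched from a Python script rather than a batch file or PowerShell script, the same per-run setting can be sketched as follows. This is an illustration under stated assumptions: the helper <code>run_with_gpus</code> is hypothetical, and the child command is a stand-in that only prints the variable it received. In practice it would be replaced by the TUFLOW executable and its arguments.

```python
import os
import subprocess
import sys


def run_with_gpus(command, gpu_ids):
    """Run command with CUDA_VISIBLE_DEVICES set for that process only."""
    env = os.environ.copy()  # copy, so the parent environment is untouched
    env["CUDA_VISIBLE_DEVICES"] = ",".join(gpu_ids)
    return subprocess.run(command, env=env, capture_output=True, text=True)


gpu_ids = [
    "GPU-5060f556-4eb4-7155-4020-abadcb2fd735",
    "GPU-f3825978-37f8-b933-5327-583196d560cd",
]

# Stand-in child process that just echoes the variable it received.
child = [sys.executable, "-c",
         "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"]
result = run_with_gpus(child, gpu_ids)
print(result.stdout.strip())
```

Because the value is passed only to the child process, other applications on the machine are not affected.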
 
If a globally set value is preferred, it can either be set for a single user account by finding "Edit environment variables ''for your account''" in the Windows Start menu and entering the values without quotes, or it can be set for all users on the machine by finding "Edit the ''system'' environment variables" in the Windows Start menu and doing the same in the 'System Variables' section. Note that administrator rights (elevation) are required to do the latter.
 
'''Warning:''' setting the value globally affects all CUDA-capable applications, not just TUFLOW. Please ensure that no other applications need the CUDA capabilities of the GPUs that are left out, or use a local value in scripts or batch files instead.