Configure CUDA device selection: Difference between revisions


Revision as of 12:22, 18 June 2025

The computer you use to run TUFLOW may have multiple GPUs. These can be multiple NVIDIA GPUs with CUDA capabilities, which you may want to use to accelerate running your models. Or they can be additional GPUs for other purposes, like rendering the interactive desktop for users of the computer, or other computational tasks. Modern motherboards also commonly provide an integrated GPU.

Generally, we recommend rendering the desktop, if needed, on a GPU that you don't use for TUFLOW modelling. If you don't have an additional GPU available, you can use one of the NVIDIA GPUs, but we would then recommend using the most capable card as the primary card for running your models, and the secondary card as the primary GPU for rendering the desktop.

TUFLOW allows you to select a specific GPU for its compute, using command line options like -pu0 for the first GPU, -pu1 for the second, etc. (see HPC Running and Converting Models).

However, you may find that what TUFLOW considers the first or second GPU does not match your expectations based on what you see in tools like the Windows Device Manager, Task Manager, or the output from nvidia-smi on the command line. Another common problem is that the GPUs you want to use are not actually #0 and #1 and you may have trouble selecting the cards you prefer, in the order you prefer them in.

To this end, you can set an environment variable called CUDA_VISIBLE_DEVICES, which limits the devices that will be visible to CUDA-capable applications like TUFLOW, and also specifies the order they will appear in. The rest of this article explains how to go about that. As an example, we'll use a Windows computer that has two NVIDIA GPUs and an on-board AMD GPU. In Windows, you can list all the available GPUs using a PowerShell command like this:

Get-CimInstance -Namespace root\cimv2 -ClassName Win32_VideoController | Select-Object DeviceID, Name

(You can run PowerShell commands by opening PowerShell from the Windows Start Menu and pasting a command there.)

The output for the example computer looks like this (note that even virtual adapters like a Remote Desktop adapter will show):

DeviceID         Name
--------         ----
VideoController1 AMD Radeon(TM) Graphics
VideoController2 Microsoft Remote Display Adapter
VideoController3 NVIDIA GeForce RTX 4090
VideoController4 NVIDIA GeForce RTX 4090

In this case, we only need 'VideoController3' and 'VideoController4' to be visible to CUDA-enabled applications like TUFLOW. We can get more details on those by running the following command (from either PowerShell, Command Prompt, or a Linux shell):

nvidia-smi --query-gpu=name,uuid --format=csv,noheader,nounits

And the output looks like this:

NVIDIA GeForce RTX 4090, GPU-5060f556-4eb4-7155-4020-abadcb2fd735
NVIDIA GeForce RTX 4090, GPU-f3825978-37f8-b933-5327-583196d560cd

The tool won't list the AMD card, but up to and including version 2025.1 of TUFLOW, that card may still interfere with your GPU selection order. Also, from this readout, it is not at all clear which card is which and the order here may not match the order you expect from tools like Task Manager ('GPU 0', 'GPU 1', etc.).

This is what we will solve by setting the environment variable CUDA_VISIBLE_DEVICES. There are two possible formats. It can either have a short value like 0,1 or a more explicit value like GPU-5060f556-4eb4-7155-4020-abadcb2fd735,GPU-f3825978-37f8-b933-5327-583196d560cd using the identifiers from the nvidia-smi output.
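As a minimal sketch of how the long-format value can be assembled, the following POSIX shell snippet joins the identifier column of the nvidia-smi output into a single comma-separated string. The function name build_visible_devices is hypothetical and only used for illustration, and the sample lines are the example output shown earlier rather than a live query. It also assumes that GPU names contain no commas.

```shell
# Join the UUID column of "name, uuid" CSV lines into one comma-separated value.
# build_visible_devices is a hypothetical helper name, not part of any tool.
build_visible_devices() {
    # Field 2 is the UUID; strip surrounding whitespace, then join lines with commas.
    awk -F',' '{ gsub(/^[ \t]+|[ \t]+$/, "", $2); print $2 }' | paste -sd, -
}

# Sample lines copied from the example output in this article; on a real machine,
# pipe in: nvidia-smi --query-gpu=name,uuid --format=csv,noheader,nounits
sample='NVIDIA GeForce RTX 4090, GPU-5060f556-4eb4-7155-4020-abadcb2fd735
NVIDIA GeForce RTX 4090, GPU-f3825978-37f8-b933-5327-583196d560cd'

value=$(printf '%s\n' "$sample" | build_visible_devices)
echo "$value"
```

The resulting order simply follows the order of the input lines, so reorder the lines (or the resulting entries) if you want a different card to come first.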

The short format just affects the default order. If you find using -pu0 with TUFLOW selects the GPU you'd consider #1 and vice versa, you could set CUDA_VISIBLE_DEVICES to 1,0 to reverse the default order. However, this order may change as you install new hardware or reinstall existing hardware, so the recommendation is to use the explicit values in the long format.
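To illustrate how the position in the list relates to the device index, here is a small sketch in POSIX shell. It assumes, as described above, that -pu0 picks the first visible entry and -pu1 the second. The function device_at is a hypothetical helper for illustration only. It does not query any hardware; it only looks up an entry in a comma-separated list.

```shell
# Print the entry that a zero-based index (as in -pu0, -pu1) refers to
# in a CUDA_VISIBLE_DEVICES-style comma-separated list.
# device_at is a hypothetical helper name used only for this illustration.
device_at() {
    # $1: comma-separated list, $2: zero-based index (cut counts fields from 1)
    printf '%s\n' "$1" | cut -d',' -f"$(( $2 + 1 ))"
}

device_at "1,0" 0   # prints 1: index 0 now refers to the card that was #1 by default
device_at "1,0" 1   # prints 0: index 1 now refers to the card that was #0 by default
```

The same lookup works for the long format, where each entry is a GPU identifier instead of a number.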

You can either set the value of the environment variable at the start of scripts you use to run your models, like batch files, PowerShell scripts, or Linux shell scripts, or you can set it globally so that it automatically applies to all running applications.

In a batch file or from the Command Prompt use this (note there are no quotes around the values, replace the values with the identifiers for your GPUs):

SET CUDA_VISIBLE_DEVICES=GPU-5060f556-4eb4-7155-4020-abadcb2fd735,GPU-f3825978-37f8-b933-5327-583196d560cd

In a PowerShell script or from the PowerShell prompt use this (note the quotes around the values, replace the values with the identifiers for your GPUs):

$env:CUDA_VISIBLE_DEVICES = "GPU-5060f556-4eb4-7155-4020-abadcb2fd735,GPU-f3825978-37f8-b933-5327-583196d560cd"
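For a Linux shell script, a comparable sketch would be the following. It reuses the example identifiers from this article, which you would replace with your own. The final line only demonstrates that child processes inherit the value; it is not a TUFLOW command.

```shell
# Make the two example GPUs visible, in this order, to everything started
# from this shell or script (replace the identifiers with your own).
export CUDA_VISIBLE_DEVICES="GPU-5060f556-4eb4-7155-4020-abadcb2fd735,GPU-f3825978-37f8-b933-5327-583196d560cd"

# Child processes inherit the exported value; this child just echoes it back.
sh -c 'echo "$CUDA_VISIBLE_DEVICES"'
```

Placing the export line at the top of a run script keeps the setting local to that script and the programs it starts.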

If you prefer to set the value globally, you can either set it for a single user account by finding "Edit environment variables for your account" in the Windows Start menu and entering the values without quotes, or you can set it for all users on the machine by finding "Edit the system environment variables" in the Windows Start menu and doing the same in the 'System Variables' section. Note that you need to be an administrator to be able to do the latter.

Warning: setting the value globally affects all CUDA-capable applications, not just TUFLOW. Please ensure that no other applications need the CUDA capabilities of the GPUs you're leaving out, or use a local value in your scripts or batch files instead.
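One way to keep the value as local as possible in a Linux shell is to set it for a single command only. The sketch below uses a stand-in child process that echoes the variable; in practice you would put your own model run command in its place. The value 1,0 is just an example.

```shell
# Start from a clean state so the effect of the prefix form is visible.
unset CUDA_VISIBLE_DEVICES

# Prefixing a command with VAR=value sets the variable for that command only.
inside=$(CUDA_VISIBLE_DEVICES="1,0" sh -c 'echo "$CUDA_VISIBLE_DEVICES"')

# Back in the calling shell, the variable is still not set.
after=${CUDA_VISIBLE_DEVICES:-unset}

echo "inside the command: $inside"
echo "afterwards in the shell: $after"
```

Other CUDA-capable applications started from the same shell are then unaffected by the setting.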

@@ Line 3: / Line 3: @@
 Generally, we recommend using a GPU you don't use for TUFLOW modelling as your primary GPU for rendering the desktop, if needed. If you don't have an additional GPU available, you can use one of the NVIDIA GPUs, be we would then recommend using the most capable card as the primary card for running your models, and the secondary card as the primary GPU for rendering the desktop.
-TUFLOW allows you to select a specific GPU for its compute, using command line options like <source enclose="none">-pu0</source> for the first GPU, <source enclose="none">-pu1</source> for the second, etc. (see [[HPC Running and Converting Models]])
+TUFLOW allows you to select a specific GPU for its compute, using command line options like <code>-pu0</code> for the first GPU, <code>-pu1</code> for the second, etc. (see [[HPC Running and Converting Models]])
-However, you may find that what TUFLOW considers the first or second GPU does not match your expectations based on what you see in tools like the Windows Device Manager, Task Manager, or the output from <source enclose="none">nvidia-smi</source> on the command line. Another common problem is that the GPUs you want to use are not actually #0 and #1 and you may have trouble selecting the cards you prefer, in the order you prefer them in.
+However, you may find that what TUFLOW considers the first or second GPU does not match your expectations based on what you see in tools like the Windows Device Manager, Task Manager, or the output from <code>nvidia-smi</code> on the command line. Another common problem is that the GPUs you want to use are not actually #0 and #1 and you may have trouble selecting the cards you prefer, in the order you prefer them in.
-To this end, you can set an environment variable called <source enclose="none">CUDA_VISIBLE_DEVICES</source>, which limits the devices that will be visible to CUDA-capable applications like TUFLOW, as well as specifying the order they will appear in. The rest of this article will explain how to go about that. As an example, we'll use a Windows computer that has 2 NVIDIA GPUs, and an on-board AMD GPU. In Windows, you can list all the available GPUs using a Powershell command like this:
+To this end, you can set an environment variable called <code>CUDA_VISIBLE_DEVICES</code>, which limits the devices that will be visible to CUDA-capable applications like TUFLOW, as well as specifying the order they will appear in. The rest of this article will explain how to go about that. As an example, we'll use a Windows computer that has 2 NVIDIA GPUs, and an on-board AMD GPU. In Windows, you can list all the available GPUs using a Powershell command like this:
-<source language="powershell">
+<syntaxhighlight lang="powershell">
 Get-CimInstance -Namespace root\cimv2 -ClassName Win32_VideoController | Select-Object DeviceID, Name
+</syntaxhighlight>
-</source>
 (you can run PowerShell commands by opening PowerShell from the Windows Start Menu and pasting a command there)
@@ Line 23: / Line 23: @@
 </pre>
 In this case, we only need 'VideoController3' and 'VideoController4' to be visible to CUDA-enabled applications like TUFLOW. We can get more details on those by running the following command (from either PowerShell, Command Prompt, or a Linux shell):
+<syntaxhighlight>
-<source>
 nvidia-smi --query-gpu=name,uuid --format=csv,noheader,nounits
+</syntaxhighlight>
-</source>
 And the output looks like this:
 <pre>
@@ Line 33: / Line 33: @@
 The tool won't list the AMD card, but up to and including version 2025.1 of TUFLOW, that card may still interfere with your GPU selection order. Also, from this readout, it is not at all clear which card is which and the order here may not match the order you expect from tools like Task Manager ('GPU 0', 'GPU 1', etc.).
-This is what we will solve by setting the environment variable <source enclose="none">CUDA_VISIBLE_DEVICES</source>. There are two possible formats. It can either have a value like <source enclose="none">0,1</source> or a more explicit value like <source enclose="none">GPU-5060f556-4eb4-7155-4020-abadcb2fd735,GPU-f3825978-37f8-b933-5327-583196d560cd</source> using the identifiers from the <source enclose="none">nvidia-smi</source> output.
+This is what we will solve by setting the environment variable <code>CUDA_VISIBLE_DEVICES</code>. There are two possible formats. It can either have a value like <code=>0,1</code> or a more explicit value like <code>GPU-5060f556-4eb4-7155-4020-abadcb2fd735,GPU-f3825978-37f8-b933-5327-583196d560cd</code> using the identifiers from the <code>nvidia-smi</code> output.
-The short format just affects the default order. If you find using <source enclose="none">-pu0</source> with TUFLOW selects the GPU you'd consider #1 and vice versa, you could set <source enclose="none">CUDA_VISIBLE_DEVICES</source> to <source enclose="none">1,0</source>, to reverse the default order. However, this order may change as you install new hardware or reinstall existing hardware, so the recommendation is to use the explicit values in the long format.
+The short format just affects the default order. If you find using <code>-pu0</code> with TUFLOW selects the GPU you'd consider #1 and vice versa, you could set <code>CUDA_VISIBLE_DEVICES</code> to <code="none">1,0</code>, to reverse the default order. However, this order may change as you install new hardware or reinstall existing hardware, so the recommendation is to use the explicit values in the long format.
 You can either set the value of the environment variable at the start of scripts you use to run your models, like batch files, PowerShell scripts, or Linux shell scripts, or you can set it globally so that it automatically applies to all running applications.
 In a batch file or from the Command Prompt use this (note there are no quotes around the values, replace the values with the identifiers for your GPUs):
-<source language="dos">
+<syntaxhighlight lang="dos">
 SET CUDA_VISIBLE_DEVICES=GPU-5060f556-4eb4-7155-4020-abadcb2fd735,GPU-f3825978-37f8-b933-5327-583196d560cd
+</syntaxhighlight>
-</source>
 In a PowerShell script or from the PowerShell prompt use this (note the quotes around the values, replace the values with the identifiers for your GPUs):
-<source language="powershell">
+<syntaxhighlight lang="powershell">
 $env:CUDA_VISIBLE_DEVICES = "GPU-5060f556-4eb4-7155-4020-abadcb2fd735,GPU-f3825978-37f8-b933-5327-583196d560cd"
+</syntaxhighlight>
-</source>
 If you prefer to set the value globally, you can either set it for a single user account by finding "Edit environment variables ''for your account''" in the Windows Start menu and entering the values without quotes, or you can set it for all users on the machine by finding "Edit the ''system'' environment variables" in the Windows Start menu and doing the same in the 'System Variables' section. Note that you need to be an administrator to be able to do the latter.
