Difference between revisions of "TUFLOW crashing"
Latest revision as of 18:29, 12 July 2023
Introduction
From time to time TUFLOW simulations can crash. There are multiple reasons why it could happen. This page can be used as a guide to find and rectify the cause of the crash.
Troubleshooting Tips
- Use the latest TUFLOW release available at the TUFLOW website.
- If using GPU, update the graphics card driver to the latest version from the Nvidia website.
- Restart the modelling machine.
- Check the end of the .tlf file for an error message.
- Test running the model on a different machine.
- Save all outputs (checks, results and logs) to a local drive and use a TUFLOW executable saved on a local drive to determine whether the network is causing the issue.
- Monitor whether the issue affects a single model or every model, and whether it occurs at a specific time during the simulation or randomly.
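Checking the end of the .tlf file can be scripted when many log files need to be reviewed. The following is a minimal Python sketch, not part of TUFLOW: the function names are hypothetical, and the assumption that error lines contain the word "ERROR" should be confirmed against your own .tlf files.

```python
from collections import deque


def tail_tlf(tlf_path, n_lines=20):
    """Return the last n_lines lines of a log (.tlf) file as a list of strings."""
    # errors="replace" avoids failing on unexpected characters in the log
    with open(tlf_path, "r", errors="replace") as f:
        return [line.rstrip("\n") for line in deque(f, maxlen=n_lines)]


def find_error_lines(lines, keyword="ERROR"):
    """Return the lines containing the keyword (case-insensitive).

    The keyword "ERROR" is an assumption about how errors appear in the log.
    """
    return [line for line in lines if keyword.lower() in line.lower()]
```

For example, `find_error_lines(tail_tlf("M01_5m_001.tlf"))` would list any matching lines near the end of that log.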
How to keep simulation console window open
The simulation console window can display errors that are not written to the .tlf file and so provide more information for troubleshooting:
- Insert "pause" at the end of the batch file and remove any Start "TUFLOW", -b and/or -t switches. If the current batch file still doesn't keep the console window open after these changes, use a simple batch file like the one below.
  - The simulation console window is the one showing the TUFLOW build, for example "TUFLOW Build: 2020-10-AD-iSP-w64", in the top left corner of its header.
  - The batch file console window, which shows "C:\WINDOWS\system32\cmd.exe" in the top left corner of its header, won't provide the desired information.
    "C:\TUFLOW\Releases\2020-10-AD\TUFLOW_iSP_w64.exe" "M01_5m_001.tcf"
    pause
- Or write the console output to a text file. The command below redirects the console output messages as well as the standard error stream to the "dump.txt" file (the trailing 2>&1 is what sends the error stream to the same file), and it will likely record more error information than the usual .tlf file.
    "C:\TUFLOW\Releases\2020-10-AD\TUFLOW_iSP_w64.exe" "M01_5m_001.tcf" > dump.txt 2>&1
Simulation DOS window flashes and disappears
When the batch file is double clicked and the simulation DOS window flashes and disappears, the problem might be an incorrect filepath to the TUFLOW executable or incorrect batch file syntax.
Suggestions:
- Check that the TUFLOW executable can be found at the specified filepath (absolute or relative).
- Double click the executable. This performs a licence check and a DOS window should appear. If it doesn't, move the executable to a location where it is permitted to run. Some locations on the C drive might be restricted for some users, preventing the simulation from being executed.
- TUFLOW doesn't run from a batch file if the filepaths are specified as UNC paths. The folder containing both the executable and the model must be opened through a mapped drive. Type "net use <drive>: \\server_name\share_name" in the command line to map the drive.
- If setting the executable path in a variable with 'set exe', confirm there are no spaces around the equals sign (e.g. set exe="..\..\..\..\exe\2020-10-AD\TUFLOW_iSP_w64.exe").
- Write 'pause' at the end of the script and rerun the batch file. The DOS window should then remain open and provide more information.
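Two of the checks above (UNC paths and spaces around the equals sign in a 'set' line) are easy to overlook by eye. The following is a minimal Python sketch of how they could be automated. The function names are hypothetical and the checks are deliberately simple, so treat them as a first pass rather than a full batch file parser.

```python
import ntpath
import re


def is_unc_path(path):
    """Return True if a Windows path uses a UNC prefix (server and share name)."""
    # Drop surrounding quotes and normalise forward slashes before testing
    cleaned = path.strip().strip('"').replace("/", "\\")
    drive, _rest = ntpath.splitdrive(cleaned)
    # For UNC paths, splitdrive returns the server and share as the "drive"
    return drive.startswith("\\\\")


def has_spaces_around_equals(line):
    """Return True if a batch 'set name=value' line has a space next to '='."""
    return re.match(r"\s*set\s+\w+(\s+=|=\s+)", line, re.IGNORECASE) is not None
```

Running both functions over each line of a batch file would flag paths that need a mapped drive and 'set' lines that need their spaces removed.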
TCF does not exist
The .tcf name or the filepath is incorrect or not supported.
Suggestions:
- Check that the name of the .tcf file and its filepath (absolute or relative) are correct.
- TUFLOW doesn't run from a batch file if the filepaths are specified as UNC paths. The folder containing both the executable and the model must be opened through a mapped drive. Type "net use <drive>: \\server_name\share_name" in the command line to map the drive.
TUFLOW crashes during model initialisation
Causes:
- Erroneous input data or an error in the control files. It might be reported as a standard TUFLOW error or as a Fortran compiler error, which leaves more information only in the console window.
- Multiple large models initialising at the same time, which can cause a memory overload and stop the simulations.
Suggestions:
- Model input errors:
  - Open the .tlf. The control file/GIS layer listed just above the error is the likely cause of the issue. Check that file/GIS layer for any unusual setup.
- Memory overload:
  - Display "Peak working set (memory)" in the Task Manager to confirm the memory overload. This column is not displayed by default.
  - Insert "timeout xxx" in the batch file between the runs to space out the most memory-demanding initialisation stages of the models, where xxx is the number of seconds to wait before the next run starts.
- If you need further help, send the .tlf, .hpc.tlf and a snapshot of the console window to support@tuflow.com.
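The idea of spacing out model start-up can also be expressed outside a batch file. The following is a minimal Python sketch of the same staggering technique. The function name is hypothetical, and the delay value is something to tune for your own models and machine.

```python
import subprocess
import time


def launch_staggered(commands, delay_seconds):
    """Start each command without waiting for it to finish,
    pausing delay_seconds between consecutive launches."""
    processes = []
    for i, cmd in enumerate(commands):
        if i > 0:
            # Give the previous run time to get through initialisation
            time.sleep(delay_seconds)
        processes.append(subprocess.Popen(cmd))
    return processes
```

Each command would be a list such as the TUFLOW executable path, the -b switch and the .tcf name. Calling `wait()` on each returned process blocks until that run has finished.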
TUFLOW crashes randomly at any time during simulation
Causes:
- The modelling machine loses its connection to the network while writing outputs to a network drive and/or while using a TUFLOW executable located on a network drive. When this happens there may be no message box at all, or a message box stating that TUFLOW.exe has stopped working. Nothing error related will be written to the .tlf file. If multiple models were running simultaneously, all unfinished models would crash. This applies to the Windows 10, 8 and 7 operating systems. With Windows XP, the simulation would only pause and then resume once access to the network drive was restored. The difference in behaviour is unfortunately due to the operating system and, as far as we are aware, nothing can be done within the TUFLOW code to handle this situation.
- The output drive might not have enough free space, so the simulation seemingly crashes at a random time (at any map output interval).
- Something might be preventing the simulation from writing to the network drive, such as a scheduled backup, updates, a restart, deduplication or other scheduled processes. This would often happen at a similar time of the day or week, seemingly stopping TUFLOW simulations at a random simulation time. In this case TUFLOW does not release the licence at the end of the simulation. Instead, the CodeMeter system determines that the TUFLOW application is no longer running and releases the licence itself. This shows as "Handle xxx automatically released. The application is no longer available." in the cmDust file.
Suggestions:
- Unstable network:
  - Use a TUFLOW executable saved on a local drive.
  - Set a local drive as the output drive for all checks, results and logs. This can be done using the Output Drive command in the .tcf. All outputs can be copied back to the network drive at the end of the simulation if required. Robocopy can be added to the end of the .bat file for this; see the Robocopy example at the end of this page.
- Insufficient storage:
  - Ensure storage locations that outputs are written to have enough free space. Depending on the infrastructure, reported free space may not be accurate, or performance of storage may substantially degrade as it nears being full.
- Other processes:
  - Check with IT whether such processes are occurring and if reasonable limits/preferences/priorities can be set to reduce resource use (disk/network), to allow other processes to run in parallel.
  - Run models locally if it is known the runs will coincide with such processes.
- If you need further help, send the .tlf, .hpc.tlf and a snapshot of the console window to support@tuflow.com.
TUFLOW crashes at the end of the simulation before writing out maximum results
The output drive might not have enough free space to write out the full results. If multiple TUFLOW simulations run in parallel, the free space might be filling up faster than expected. There can also be other processes filling up the drive, e.g. other software, backups, other users copying data. The target drive should never be planned to be filled to the brim, as performance will suffer for all processes.
Suggestions:
- Ensure storage locations that outputs are written to have enough free space. Depending on the infrastructure, reported free space may not be accurate, or performance of storage may substantially degrade as it nears being full.
- If you need further help, send the .tlf, .hpc.tlf and a snapshot of the console window to support@tuflow.com.
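Free space on the output drive can be checked before a batch of runs is started. The following is a minimal Python sketch using the standard library. The function names and the required amount are hypothetical, and as noted above the reported free space may not be accurate on some storage infrastructure, so leave a generous margin.

```python
import shutil


def free_space_gb(path):
    """Return the free space reported for the drive holding path, in GB."""
    return shutil.disk_usage(path).free / 1024**3


def has_enough_space(path, required_gb):
    """Return True if the reported free space is at least required_gb."""
    return free_space_gb(path) >= required_gb
```

For example, `has_enough_space(r"D:\TUFLOW\results", 200)` would return False if less than 200 GB is reported free on that drive.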
TUFLOW crashes trying to open _ All TUFLOW Simulations.log file
This crash is often incorrectly believed to be caused by a licence issue as the .tlf file ends soon after listing the licences. If the last line in the .tlf is as below, the simulation has stopped because the user doesn't have write permissions to the _ All TUFLOW Simulations.log file or the directory hasn't been created yet.
"Trying to open (A) file C:\<<log_directory>>\_ All TUFLOW Simulations.log...OK. File Unit: 904"
Suggestions:
- Ask your IT staff for write permission to the folder.
- Change the default log location to a folder where you already have write permission, using the TCF command Simulations Log Folder == <<folder>>.
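Both conditions described above (the folder not existing yet, or no write permission) can be tested before a run. The following is a minimal Python sketch. The function name is hypothetical, and it only tests whether a file can be created in the folder, not whether the existing _ All TUFLOW Simulations.log file itself is writable.

```python
import os
import tempfile


def can_write_to(folder):
    """Return True if folder exists and a temporary file can be created in it."""
    if not os.path.isdir(folder):
        # The directory hasn't been created yet
        return False
    try:
        # The temporary file is deleted automatically when closed
        with tempfile.TemporaryFile(dir=folder):
            return True
    except OSError:
        # Typically a permission error
        return False
```

If the function returns False for the log directory, either create the folder, request write access, or point the log to another location with the TCF command mentioned above.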
TUFLOW can't find a valid licence
Confirmation that the licence is the issue can usually be found in the .tlf.
Unmaintained dongle
Once a year (usually after mid-year) a new TUFLOW release will need the licence to be updated. This will show in the .tlf file as "Unmaintained since <year>". Follow WIBU Licence Update Request to get the licence updated.
Console window lists all licences, after pressing Enter the licences are released and model doesn't run
What happened here is a licence check, which doesn't run a TUFLOW model and doesn't produce a .tlf file. The licence check temporarily takes one licence from every available module until the licences are released by pressing Enter (released immediately) or by closing the console window (released after a couple of minutes). Once the licences are released by pressing Enter, the console window shows "Not licenced or no local licence free". This means the licences are no longer taken by the licence check and can be used for running models.
The licence check is performed in the following situations:
- Double clicking a TUFLOW executable.
- Running a batch file with only TUFLOW executable specified.
- Running TUFLOW from Notepad++, when inserting only the TUFLOW executable into the Run window.
See the Wiki pages on how to run a TUFLOW model and how to set up running TUFLOW from Notepad++.
Running very old TUFLOW releases
When using an old legacy model with its original TUFLOW release, there might be an error "Could not find standalone or network dongle server". Such versions of TUFLOW can only run with the old blue softlok licence dongle. If only the metal Wibu dongle is available, the DB version of the same year's release can be used. We still have some of the old dongles and can rent them out if required.
More information on TUFLOW licence dongles and which releases are affected: TUFLOW Licensing
No TUFLOW dongle and licence settings files found (.lcf and .dcf)
Some users might mistake this sentence in the .tlf file for a licence issue. The "No TUFLOW licence settings files found" sentence is followed by "Default settings applied" and the default settings are listed on the next three lines, e.g. WIBU Retry Time, WIBU Retry Count, WIBU Dongles Only. The licence control file (.lcf) or dongle control file (.dcf) would only be created if these settings need to be different. Not having these files doesn't prevent TUFLOW from running. The .tlf should continue past these lines and, if there is an error, it would be written at the bottom of the .tlf.
Technical licence issues
See Wibu Dongles Troubleshooting.
Robocopy example to automatically copy outputs on network drive after model runs
    @echo off

    set TUFLOWEXE_iSP=O:\TUFLOW\Releases\2020-01\2020-10-AA\TUFLOW_iSP_w64.exe
    set RUN_iSP=start "TUFLOW" /wait "%TUFLOWEXE_iSP%" -b

    set A=5m 2.5m
    set B=EXG DEV
    set source_results=D:\TUFLOW\results
    set source_log=D:\TUFLOW\runs
    set destination_results=O:\TUFLOW\support\results
    set destination_log=O:\TUFLOW\support\runs

    FOR %%a in (%A%) do (
        FOR %%b in (%B%) do (
            rem Run model
            echo Running Cell Size %%a Model Scenario %%b
            %RUN_iSP% -s1 %%a -s2 %%b M10_~s1~_~s2~_003.tcf

            rem Move results folder to different location
            robocopy "%source_results%" "%destination_results%" /e /move

            rem Move log folder to different location
            robocopy "%source_log%" "%destination_log%" /e /move
            timeout 5
        )
    )
    pause