TUFLOW crashing

Revision as of 18:29, 12 July 2023 by ElizaCollison

Introduction

From time to time, TUFLOW simulations can crash, and there are several possible causes. Use this page as a guide to finding and rectifying the cause of a crash.

Troubleshooting Tips

  • Use the latest TUFLOW release available at the TUFLOW website.
  • If using GPU, update the graphics card driver to the latest version from the Nvidia website.
  • Restart the modelling machine.
  • Check the end of the .tlf file for an error message.
  • Test running the model on a different machine.
  • Save all outputs (checks, results and logs) to a local drive, and run a TUFLOW executable stored on a local drive, to determine whether the network is causing the issue.
  • Monitor whether the issue affects a single model or every model, and whether it occurs at a specific time during the simulation or at random.
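
As one way to check the end of a .tlf file without opening a large log in an editor, the following is a minimal Python sketch that returns the last lines of a text log. It assumes Python is available on the modelling machine. The function name tail_lines and the default of 20 lines are illustrative choices, not part of TUFLOW.

```python
from collections import deque
from pathlib import Path


def tail_lines(log_path, count=20):
    """Return the last `count` lines of a text log such as a .tlf file."""
    # errors="replace" keeps the read going if the log contains unusual bytes.
    with Path(log_path).open("r", errors="replace") as handle:
        # A bounded deque keeps only the final `count` lines in memory.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]
```

For example, print("\n".join(tail_lines("M01_5m_001.tlf"))) would show the final 20 lines of that log, where the file name is only a placeholder for the model's own .tlf.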


How to keep the simulation console window open

The simulation console window can display errors that are not written to the .tlf file, and so provides more information for troubleshooting:

  • Insert "pause" at the end of the batch file and remove any Start "TUFLOW" command and any -b and/or -t switches. If the batch file still doesn't keep the console window open after these changes, use a simple batch file like the one below.
    • The simulation console window is the one whose header starts with the TUFLOW build, for example "TUFLOW Build: 2020-10-AD-iSP-w64", in the top left corner.
    • The batch file console window, which shows "C:\WINDOWS\system32\cmd.exe" in the top left corner of its header, won't provide the desired information.
"C:\TUFLOW\Releases\2020-10-AD\TUFLOW_iSP_w64.exe" "M01_5m_001.tcf"
pause
  • Or, redirect the console output to a text file. The command below writes the console output messages to the “dump.txt” file, and the trailing 2>&1 sends the standard error stream to the same file. This will likely record more error information than the usual .tlf file.
"C:\TUFLOW\Releases\2020-10-AD\TUFLOW_iSP_w64.exe" "M01_5m_001.tcf" > dump.txt 2>&1

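If simulations are launched from a script rather than a batch file, the same capture can be sketched in Python. This is a minimal illustration under the assumption that Python is installed on the modelling machine. The helper name run_and_capture is hypothetical, and the function simply runs whatever command list it is given, with the error stream merged into the same dump file as the normal console output.

```python
import subprocess


def run_and_capture(command, dump_path):
    """Run `command`, writing stdout and stderr to `dump_path`.

    Returns the exit code of the finished process.
    """
    with open(dump_path, "w") as dump:
        # stderr=subprocess.STDOUT merges error messages into the same file.
        completed = subprocess.run(command, stdout=dump, stderr=subprocess.STDOUT)
    return completed.returncode
```

A call might pass the executable and .tcf from the batch example as a list, for instance run_and_capture([r"C:\TUFLOW\Releases\2020-10-AD\TUFLOW_iSP_w64.exe", "M01_5m_001.tcf"], "dump.txt"). A non-zero return value only indicates that the process reported a failure; the dump file holds the detail.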

Simulation DOS window flashes and disappears

When a batch file is double-clicked and the simulation DOS window flashes and disappears, the problem might be an incorrect filepath to the TUFLOW executable or incorrect syntax in the batch file.
Suggestions:

  • Check the TUFLOW executable can be found with the specified filepath (absolute or relative).
  • Double-click the executable. This performs a licence check, and a DOS window should appear. If it doesn't, move the executable to a location where it is permitted to run. Some locations on the C drive might be restricted for some users, preventing the simulation from executing.
  • TUFLOW doesn't run from a batch file if the filepaths are specified as UNC paths. The folder containing both the executable and the model must be opened via a mapped drive. Type "net use <drive>: \\server_name\share_name" in the command line to map the drive.
  • If setting a variable with 'set exe', confirm there are no spaces surrounding the equals sign (e.g. set exe="..\..\..\..\exe\2020-10-AD\TUFLOW_iSP_w64.exe").
  • Write 'pause' at the end of the script and rerun the batch file. The DOS window should then remain open, providing more information.
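
When many batch files need checking for UNC paths, the test can be scripted. The following is a minimal Python sketch, assuming Python is available. The function name is_unc_path is hypothetical, and the check only inspects the text of the path, so it does not confirm that the path exists or that a drive is mapped.

```python
from pathlib import PureWindowsPath


def is_unc_path(path_text):
    """Return True if `path_text` is a UNC path (server and share, no drive letter)."""
    # For a UNC path, PureWindowsPath reports the server and share as the
    # "drive", which begins with two backslashes. Mapped drives give "C:" etc.
    drive = PureWindowsPath(path_text).drive
    return drive.startswith("\\\\")
```

With this sketch, is_unc_path(r"\\server_name\share_name\runs") is True, while a mapped-drive path such as r"C:\TUFLOW\Releases\2020-10-AD\TUFLOW_iSP_w64.exe" or a relative path gives False.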


TCF does not exist

The .tcf name or the filepath is incorrect or not supported.
Suggestions:

  • Check that the name of the .tcf file and its filepath (absolute or relative) are correct.
  • TUFLOW doesn't run from a batch file if the filepaths are specified as UNC paths. The folder containing both the executable and the model must be opened via a mapped drive. Type "net use <drive>: \\server_name\share_name" in the command line to map the drive.


TUFLOW crashes during model initialisation

Causes:

  • Erroneous input data or an error in the control files. This might be reported as a standard TUFLOW error, or as a Fortran compiler error that leaves more information only in the console window.
  • If multiple large models initialise at the same time, they can overload the memory and stop the simulations.

Suggestions:

  • Model input errors:
    • Open the .tlf. The control file or GIS layer listed just above the error should be the cause of the issue. Check it for any unusual setup.
  • Memory overload:
    • Display "Peak working set (memory)" in the Task Manager to confirm the memory overload. This column is not displayed by default.
    • Insert "timeout xxx" in the batch file between the runs to space out the most memory-demanding initialisation parts of the models, where xxx is the number of seconds to wait before the next run starts.
  • If you need further help, send the .tlf, .hpc.tlf and a snapshot of the console window to support@tuflow.com.
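
The idea of spacing out model starts can also be sketched in Python for users who launch runs from a script. This is an illustration only, assuming Python is available. The function name launch_staggered and its arguments are hypothetical, and a suitable delay depends on how long each model's initialisation takes.

```python
import subprocess
import time


def launch_staggered(commands, delay_seconds):
    """Start each command in turn, waiting `delay_seconds` between launches.

    Returns the list of started processes so the caller can wait on them.
    """
    processes = []
    for index, command in enumerate(commands):
        if index > 0:
            # Give the previous model time to get through initialisation.
            time.sleep(delay_seconds)
        # Popen returns immediately, so the models run in parallel.
        processes.append(subprocess.Popen(command))
    return processes
```

Each entry in commands would be a list such as the executable path followed by the .tcf name. Calling wait() on each returned process blocks until that run has finished.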


TUFLOW crashes randomly at any time during simulation

Causes:

  • The modelling machine loses its connection to the network while writing outputs to a network drive and/or while using a TUFLOW executable located on a network drive. When this happens, there may be no message box at all, or a message box stating that TUFLOW.exe has stopped working. Nothing error-related will be written to the .tlf file. If multiple models were running simultaneously, all unfinished models would crash. This applies to the Windows 10, 8 and 7 operating systems. With Windows XP, the simulation would only pause and then resume once access to the network drive was restored. This difference in behaviour comes from the operating system and, as far as we are aware, nothing can be done within the TUFLOW code to handle this situation.
  • The output drive might not have enough free space, so the simulation crashes at a seemingly random time (at any map output interval).
  • Something might be preventing the simulation from writing to the network drive, such as a scheduled backup, update, restart, deduplication or other scheduled process. This often happens at a similar time of day or week, so it appears to stop TUFLOW simulations at a random simulation time. In this case TUFLOW does not release the licence at the end of the simulation. Instead, the CodeMeter system determines that the TUFLOW application is no longer running and releases the licence itself. This shows as “Handle xxx automatically released. The application is no longer available.” in the cmDust file.

Suggestions:

  • Unstable network:
    • Use TUFLOW executable saved on a local drive.
    • Set a local drive as the output drive for all checks, results and logs. This can be done using the Output Drive command in the .tcf. All outputs can be copied back to the network drive at the end of the simulation if required, for example by adding Robocopy to the end of the .bat file (see the Robocopy example at the end of this page).
  • Insufficient storage:
    • Ensure storage locations that outputs are written to have enough free space. Depending on the infrastructure, reported free space may not be accurate, or performance of storage may substantially degrade as it nears being full.
  • Other processes:
    • Check with IT whether such processes are occurring and if reasonable limits/preferences/priorities can be set to reduce resource use (disk/network), to allow other processes to run in parallel.
    • Run models locally if it is known the runs will coincide with such processes.
  • If you need further help, send the .tlf, .hpc.tlf and a snapshot of the console window to support@tuflow.com.


TUFLOW crashes at the end of the simulation before writing out maximum results

The output drive might not have enough free space to write out the full results. If multiple TUFLOW simulations run in parallel, the free space might be filling up faster than expected. There can also be other processes filling up the drive, e.g. other software, backups, other users copying data. The target drive should never be planned to be filled to the brim, as performance will suffer for all processes.
Suggestions:

  • Ensure storage locations that outputs are written to have enough free space. Depending on the infrastructure, reported free space may not be accurate, or performance of storage may substantially degrade as it nears being full.
  • If you need further help, send the .tlf, .hpc.tlf and a snapshot of the console window to support@tuflow.com.
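
A free-space check before starting a batch of runs can be sketched in Python as below. This is a minimal illustration, assuming Python is available. The function names and the required_gb threshold are hypothetical, and, as noted above, the free space reported by some storage infrastructure may not be accurate.

```python
import shutil


def free_space_gb(folder):
    """Return the free space, in gigabytes, on the drive that holds `folder`."""
    return shutil.disk_usage(folder).free / 1024**3


def has_enough_space(folder, required_gb):
    """Return True if at least `required_gb` gigabytes are free for `folder`."""
    return free_space_gb(folder) >= required_gb
```

The required figure would need to come from experience with similar models, for example the size of the results from a previous run multiplied by the number of simulations running in parallel, plus a margin so the drive is never filled completely.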


TUFLOW crashes trying to open _ All TUFLOW Simulations.log file

This crash is often incorrectly believed to be caused by a licence issue, because the .tlf file ends soon after listing the licences. If the last line in the .tlf is as below, the simulation has stopped either because the user doesn't have write permission for the _ All TUFLOW Simulations.log file or because the directory hasn't been created yet.

"Trying to open (A) file C:\<<log_directory>>\_ All TUFLOW Simulations.log...OK.  File Unit: 904"

Suggestions:

  • Ask your IT staff for write permission to the folder.
  • Change the default log location to a folder where you already have write permission, using the TCF command Simulations Log Folder == <<folder>>.
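
Whether a candidate log folder exists and is writable can be tested directly before running the model. The following is a minimal Python sketch, assuming Python is available. The function name can_write_to is hypothetical, and it tests the folder by creating a scratch file that is deleted again straight away.

```python
import tempfile
from pathlib import Path


def can_write_to(folder):
    """Return True if `folder` exists and a file can be created inside it."""
    folder = Path(folder)
    if not folder.is_dir():
        # Covers the case where the directory has not been created yet.
        return False
    try:
        # The scratch file is removed automatically when the block exits.
        with tempfile.TemporaryFile(dir=folder):
            pass
    except OSError:
        # Includes permission errors raised by the operating system.
        return False
    return True
```

A result of False for the folder named in the last line of the .tlf points to one of the two causes described above: a missing directory or missing write permission.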


TUFLOW can't find a valid licence

Confirmation that the licence is the issue can usually be found in the .tlf.

Unmaintained dongle

Once a year (usually after mid-year) a new TUFLOW release will need the licence to be updated. This will show in the .tlf file as "Unmaintained since <year>". Follow WIBU Licence Update Request to get the licence updated.

Console window lists all licences, after pressing Enter the licences are released and model doesn't run

What happened here is a licence check, which doesn't run a TUFLOW model and doesn't produce a .tlf file. The licence check temporarily takes one licence from every available module until the licences are released, either by pressing Enter (released immediately) or by closing the console window (released after a couple of minutes). Once the licences are released by pressing Enter, the console window shows “Not licenced or no local licence free”. This means the licences are no longer held by the licence check and can be used for running models.
The licence check is performed in the following situations:

  • Double clicking a TUFLOW executable.
  • Running a batch file with only TUFLOW executable specified.
  • Running TUFLOW from Notepad++, when inserting only the TUFLOW executable into the Run window.

See the Wiki pages on how to run a TUFLOW model and how to set up running TUFLOW from Notepad++.

Running very old TUFLOW releases

When running an old legacy model with its original TUFLOW release, the error "Could not find standalone or network dongle server" might occur. Such versions of TUFLOW can only run with the old blue softlok licence dongle. If only the metal Wibu dongle is available, the DB version of the same year's release can be used. We still have some of the old dongles and can rent them out if required.
More information on TUFLOW licence dongles and which releases are affected: TUFLOW Licensing

No TUFLOW dongle and licence settings files found (.lcf and .dcf)

Some users mistake this message in the .tlf file for a licence problem. The "No TUFLOW licence settings files found" sentence is followed by "Default settings applied", and the default settings are listed on the next three lines, e.g. WIBU Retry Time, WIBU Retry Count, WIBU Dongles Only. A licence control file (.lcf) or dongle control file (.dcf) only needs to be created if these settings are required to be different. Not having these files doesn't prevent TUFLOW from running. The .tlf should continue past these lines, and if there is an error, it will be written at the bottom of the .tlf.

Technical licence issues

Wibu Dongles Troubleshooting

Robocopy example to automatically move outputs to a network drive after the model runs

@echo off

rem TUFLOW executable and run command (/wait holds the batch file until the run finishes)
set TUFLOWEXE_iSP=O:\TUFLOW\Releases\2020-01\2020-10-AA\TUFLOW_iSP_w64.exe
set RUN_iSP=start "TUFLOW" /wait "%TUFLOWEXE_iSP%" -b

rem Scenario values substituted for ~s1~ and ~s2~ in the .tcf name
set A=5m 2.5m
set B=EXG DEV
rem Local source folders and network destination folders
set source_results=D:\TUFLOW\results
set source_log=D:\TUFLOW\runs
set destination_results=O:\TUFLOW\support\results
set destination_log=O:\TUFLOW\support\runs

FOR %%a in (%A%) do (
    FOR %%b in (%B%) do (
        rem Run model (rem is used because :: comments can fail inside parentheses)
        echo Running Cell Size %%a Model Scenario %%b
        %RUN_iSP% -s1 %%a -s2 %%b M10_~s1~_~s2~_003.tcf

        rem Move results folder to different location
        robocopy "%source_results%" "%destination_results%" /e /move

        rem Move log folder to different location
        robocopy "%source_log%" "%destination_log%" /e /move
        timeout 5
    )
)
pause