TUFLOW crashing
Revision as of 13:58, 9 May 2022
This Page is under construction
Introduction
From time to time TUFLOW simulations can crash, and there are multiple reasons why this can happen. This page can be used as a guide to find and rectify the cause of the crash.
Troubleshooting Tips
- Use the latest TUFLOW release available at the TUFLOW website.
- If using GPU, update the graphics card driver to the latest version from the Nvidia website.
- Restart the modelling machine.
- Check the end of the .tlf file for an error message.
- Test running the model on a different machine.
- Save all outputs (checks, results and logs) to a local drive and use a TUFLOW executable saved on a local drive to determine whether the network is causing the issue.
- Monitor whether the issue affects a single model only or every model, and whether it occurs at a specific time during the simulation or randomly.
How to keep simulation console window open
The simulation console window can display errors that are not written to the .tlf file and so provide more information for troubleshooting:
- Insert "pause" at the end of the batch file and remove any Start "TUFLOW", -b and/or -t switch. Use a simple batch file like the one below if a more complex looping batch file doesn't keep the console window open even with the suggested changes.
- The simulation console window is the one starting with the TUFLOW build, for example "TUFLOW Build: 2020-10-AD-iSP-w64", in the header's top left corner.
- The batch file console window, which states "C:\WINDOWS\system32\cmd.exe" in the header's top left corner, won't provide the desired information.
"C:\TUFLOW\Releases\2020-10-AD\TUFLOW_iSP_w64.exe" "M01_5m_001.tcf"
- Or, redirect the console window output to a text file, e.g. "TUFLOW.exe my_model.tcf > dump.txt 2>&1". This redirects the console output messages as well as the standard error stream (the "2>&1" part) to the "dump.txt" file, which will likely record more error information than the usual .tlf file.
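As a minimal sketch of the two options above, the batch file below runs a single model directly and then pauses. The executable path and .tcf name are the ones used in the example on this page and are placeholders only. The "2>&1" suffix is the standard Windows command-line syntax for sending the error stream to the same file as the normal output.

```batch
@echo off
rem Placeholder path and model name: replace with your own.
set TUFLOWEXE=C:\TUFLOW\Releases\2020-10-AD\TUFLOW_iSP_w64.exe

rem Option 1: run directly (no start "TUFLOW", -b or -t) so messages stay on screen.
"%TUFLOWEXE%" "M01_5m_001.tcf"

rem Option 2: write the console messages and the error stream to dump.txt instead.
rem "%TUFLOWEXE%" "M01_5m_001.tcf" > dump.txt 2>&1

rem Keep the window open after TUFLOW stops.
pause
```

Only one of the two options should be active at a time; remove the leading "rem" from option 2 and comment out option 1 to switch.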
TUFLOW simulation DOS window only flickers and disappears
The problem might be the filepath of the TUFLOW executable in the batch file.
Suggestions:
- Check that the executable filepath exists.
- TUFLOW doesn't currently support UNC paths. The folder with the executable has to be accessed through a mapped drive. Type "net use <drive>: \\server_name\share_name" in the command line to map the desired drive.
TCF does not exist
The .tcf name or filepath might be incorrect, or an unsupported UNC path might have been used.
Suggestions:
- Check that the name of the .tcf file and the filepath (if used) are correct.
- TUFLOW doesn't currently support UNC paths. The folder with the model has to be accessed through a mapped drive. Type "net use <drive>: \\server_name\share_name" in the command line to map the desired drive.
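The checks suggested in this and the previous section can be sketched in a batch file as follows. This is an illustration only: the drive letter T:, the server and share names, and the folder layout are hypothetical and need to be replaced with your own.

```batch
@echo off
rem Hypothetical drive letter, share and folders: adjust to your own network.
rem If T: is already mapped, net use reports an error and the script continues.
net use T: \\server_name\share_name

set TUFLOWEXE=T:\TUFLOW\Releases\2020-10-AD\TUFLOW_iSP_w64.exe
set TCF=T:\TUFLOW\runs\M01_5m_001.tcf

rem Stop with a readable message instead of a window that flickers and disappears.
if not exist "%TUFLOWEXE%" (
    echo Executable not found: "%TUFLOWEXE%"
    pause
    exit /b 1
)
if not exist "%TCF%" (
    echo Control file not found: "%TCF%"
    pause
    exit /b 1
)

"%TUFLOWEXE%" "%TCF%"
pause
```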
TUFLOW crashes during model initialisation
Causes:
- Erroneous input data or an error in the control files. It might be captured as a standard TUFLOW error, or as a Fortran compiler error that leaves more information only in the console window.
- If multiple large models are initialising at the same time, this could cause a memory overload and stop the simulations.
Suggestions:
- Open the .tlf. The control file/GIS layer written just above the error should be the cause of the issue. Check the file/GIS layer for any unusual setup.
- Display "Peak working set (memory)" in the Task Manager to confirm a memory overload. This column is not displayed by default.
- Insert "timeout xxx" in the batch file between the runs to space out the most memory-demanding initialisation stages of the models, where xxx is the number of seconds to wait before the next run starts.
- If you need further help, send the .tlf, .hpc.tlf and a snapshot of the console window to support@tuflow.com.
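A minimal sketch of spacing out two parallel runs with "timeout" is shown below. The executable path, the model names and the 300 second delay are assumptions for illustration. A suitable delay depends on how long each model takes to initialise.

```batch
@echo off
rem Placeholder executable path and model names.
set TUFLOWEXE=C:\TUFLOW\Releases\2020-10-AD\TUFLOW_iSP_w64.exe

rem start without /wait opens each run in its own window and returns at once,
rem so the two models run in parallel.
start "TUFLOW" "%TUFLOWEXE%" -b "M01_5m_001.tcf"

rem Wait 300 seconds (an arbitrary value) before the next model starts initialising.
timeout /t 300 /nobreak

start "TUFLOW" "%TUFLOWEXE%" -b "M01_2.5m_001.tcf"
```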
TUFLOW crashes randomly at any time during simulation
Causes:
- The modelling machine loses its connection to the network while writing outputs to a network drive and/or while using a TUFLOW executable located on a network drive. When this happens there may be no message box at all, or a message box stating that TUFLOW.exe has stopped working. Nothing error-related is written to the .tlf file. If multiple models are running simultaneously, all unfinished models crash. This applies to the Windows 10, 8 and 7 operating systems. With Windows XP, the simulation would only pause and restart itself once access to the network drive was restored. Unfortunately, this difference in behaviour comes from the operating system and, as far as we are aware, nothing can be done within the TUFLOW code to handle this situation.
- The output drive might not have enough free space, so the simulation seemingly crashes at a random time (at any map output interval).
- Something might be preventing the simulation from writing to the network drive, such as a scheduled backup, updates, a restart, deduplication or other scheduled processes. This often happens at a similar time of the day or week, so TUFLOW simulations seemingly stop at a random simulation time. In this case TUFLOW does not release the licence at the end of the simulation. Instead, the CodeMeter system determines that the TUFLOW application is no longer running and releases the licence. This shows as “Handle xxx automatically released. The application is no longer available.” in the cmDust file.
Suggestions:
- Unstable network:
  - Use a TUFLOW executable saved on a local drive.
  - Set a local drive as the output drive for all checks, results and logs. This can be done using the Output Drive command in the .tcf. All outputs can be copied back to the network drive at the end of the simulation if required. Robocopy can be added to the end of the .bat file, as in the example below.
- Insufficient storage:
  - Ensure that the storage locations outputs are written to have enough free space. Depending on the infrastructure, the reported free space may not be accurate, or the performance of the storage may degrade substantially as it nears being full.
- Other processes:
  - Check with IT whether such processes are occurring and whether reasonable limits/preferences/priorities can be set to reduce resource use (disk/network), to allow other processes to run in parallel.
  - Run models locally if it is known that the runs will coincide with such processes.
- If you need further help, send the .tlf, .hpc.tlf and a snapshot of the console window to support@tuflow.com.
@echo off
set TUFLOWEXE_iSP=O:\TUFLOW\Releases\2020-01\2020-10-AA\TUFLOW_iSP_w64.exe
set RUN_iSP=start "TUFLOW" /wait "%TUFLOWEXE_iSP%" -b
set A=5m 2.5m
set B=EXG DEV
set source_results=D:\TUFLOW\results
set source_log=D:\TUFLOW\runs
set destination_results=O:\TUFLOW\support\results
set destination_log=O:\TUFLOW\support\runs
FOR %%a in (%A%) do (
    FOR %%b in (%B%) do (
        rem Run model
        echo Running Cell Size %%a Model Scenario %%b
        %RUN_iSP% -s1 %%a -s2 %%b M10_~s1~_~s2~_003.tcf
        rem Move results folder to different location
        robocopy "%source_results%" "%destination_results%" /e /move
        rem Move log folder to different location
        robocopy "%source_log%" "%destination_log%" /e /move
        timeout 5
    )
)
pause
TUFLOW crashes at the same/similar time of the day/week
Something might be preventing the simulation from writing to the network drive, such as a scheduled backup, updates, a restart, deduplication or other scheduled processes. In this case TUFLOW does not release the licence at the end of the simulation. Instead, the CodeMeter system determines that the TUFLOW application is no longer running and releases the licence. This shows as “Handle xxx automatically released. The application is no longer available.” in the cmDust file.
Suggestions:
- Check with IT whether such processes are occurring and if reasonable limits/preferences/priorities can be set to reduce resource use (disk/network), to allow other processes to run in parallel.
- Run models locally if it is known the runs will coincide with such processes. More information on running models locally here.
- Ensure storage locations that outputs are written to have enough free space. Depending on the infrastructure, reported free space may not be accurate, or performance of storage may substantially degrade as it nears being full.
TUFLOW crashes at the end of the simulation before writing out maximum results
The output drive might not have enough free space to write out the full results. If multiple TUFLOW simulations run in parallel, the free space might fill up faster than expected. Other processes can also fill up the drive, e.g. other software, backups or other users copying data. Never plan to fill the target drive to the brim, as performance will suffer for all processes.
Suggestions:
- Ensure storage locations that outputs are written to have enough free space. Depending on the infrastructure, reported free space may not be accurate, or performance of storage may substantially degrade as it nears being full.
- If you need further help, send the .tlf, .hpc.tlf and snapshot of the console window to support@tuflow.com.
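One way to check the free space before starting a set of runs is sketched below. It assumes that the outputs go to a local drive D: and that Windows PowerShell is available on the modelling machine. Both are assumptions, and as noted above the reported figure may not be accurate on some storage infrastructure.

```batch
@echo off
rem Assumption: outputs are written to drive D: and PowerShell is installed.
rem Prints the free space on D: in gigabytes, rounded to one decimal place.
powershell -NoProfile -Command "[math]::Round((Get-PSDrive -Name D).Free / 1GB, 1)"
pause
```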
TUFLOW crashes trying to open _ All TUFLOW Simulations.log file
This crash is often incorrectly believed to be caused by a licence issue as the .tlf file ends soon after listing the licences. If the last line in the .tlf is as below, the simulation has stopped because the user doesn't have write permissions to the _ All TUFLOW Simulations.log file or the directory hasn't been created yet.
"Trying to open (A) file C:\<<log_directory>>\_ All TUFLOW Simulations.log...OK. File Unit: 904"
Suggestions:
- Ask your IT staff for write permission to the folder.
- Change the default log location to a folder where you already have write permission, using the TCF command "Simulations Log Folder == <<folder>>".
TUFLOW can't find a valid licence
Confirmation that the licence is the issue can usually be found in the .tlf.
Unmaintained dongle
Once a year (usually after mid-year) a new TUFLOW release will need the licence to be updated. This will show in the .tlf file as "Unmaintained since <year>". Follow WIBU Licence Update Request to get the licence updated.
Console window lists all licences, after pressing Enter the licences are released and model doesn't run
What happened here is a licence check, which doesn't run a TUFLOW model and doesn't produce a .tlf file. The licence check temporarily takes one licence from every available module until the licences are released by pressing Enter (released immediately) or by closing the console window (released after a couple of minutes). Once the licences are released by pressing Enter, the console window shows “Not licenced or no local licence free”. This means the licences are no longer taken by the licence check and can be used for running models.
The licence check is performed in the following situations:
- Double clicking a TUFLOW executable.
- Running a batch file with only the TUFLOW executable specified.
- Running TUFLOW from Notepad++ when only the TUFLOW executable is inserted into the Run window.
Check these Wiki pages on how to run a TUFLOW model and how to set up running TUFLOW from Notepad++.
Running very old TUFLOW releases
When using an old legacy model with the original TUFLOW release, there might be an error "Could not find standalone or network dongle server". Such a version of TUFLOW can only run with the old blue softlok licence dongle. If only the metal Wibu dongle is available, the DB version of the same year's release can be used. We still have some of the old dongles and can rent one out if required.
More information on TUFLOW licence dongles and which releases are affected: TUFLOW Licensing
No TUFLOW dongle and licence settings files found (.lcf and .dcf)
Some users mistake this sentence in the .tlf file for a sign that there is an issue with the licence. The "No TUFLOW licence settings files found" sentence is followed by "Default settings applied", and the default settings are listed on the next three lines, e.g. WIBU Retry Time, WIBU Retry Count, WIBU Dongles Only. The licence control file (.lcf) or dongle control file (.dcf) only needs to be created if these settings are required to be different. Not having these files doesn't prevent TUFLOW from running. The .tlf should continue past these lines, and if there is an error, it will be written at the bottom of the .tlf.