Difference between revisions of "TUFLOW crashing"


Revision as of 16:19, 15 February 2021

This Page is under construction

Introduction

From time to time TUFLOW simulations can crash, and there are multiple possible reasons. This page can be used as a guide to find and rectify the cause of a crash.

Troubleshooting Tips

  • Use the latest TUFLOW release available at the TUFLOW website.
  • If using GPU, update the graphics card driver to the latest version from the Nvidia website.
  • Restart the modelling machine.
  • Check the end of the .tlf file for an error message.
  • Test running the model on a different machine.
  • Save all outputs (checks, results and logs) to a local drive and use a TUFLOW executable saved on a local drive to determine whether the network is causing the issue.
  • Monitor whether the issue affects a single model or every model, and whether it occurs at a specific time during the simulation or randomly.
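
The check of the end of the .tlf file can be scripted when many runs need to be reviewed. The following is a minimal Python sketch, assuming the .tlf is a plain text log; the function name `tail_lines` and the default of 20 lines are illustrative choices, not part of TUFLOW.

```python
from collections import deque
from pathlib import Path


def tail_lines(log_path, count=20):
    """Return the last `count` lines of a text log such as a .tlf file."""
    # errors="replace" keeps the read going even if the log holds odd bytes.
    with Path(log_path).open("r", encoding="utf-8", errors="replace") as handle:
        # A bounded deque keeps only the most recent lines in memory.
        return [line.rstrip("\r\n") for line in deque(handle, maxlen=count)]
```

For example, `print("\n".join(tail_lines("M01_001.tlf")))` would print the final lines of a hypothetical log named M01_001.tlf, where an error message would normally appear.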


TUFLOW simulation DOS window only flicks and disappears

The problem might be the filepath of the TUFLOW executable in the batch file.
Suggestions:

  • Check that the executable filepath exists.
  • TUFLOW doesn't currently support UNC paths. The folder containing the executable has to be accessed through a mapped drive. Type "net use <drive>: \\server_name\share_name" in the command line to map the desired drive.
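
A batch file can be scanned for UNC paths before a run is started. The following is a minimal Python sketch; the function names and the regular expression are assumptions made for illustration, and the pattern stops at the first space or quote, so paths containing spaces are only partly reported.

```python
import re

# Two backslashes, a server name, a backslash, then the rest of the path.
UNC_PATTERN = re.compile(r'\\\\[^\s"\\]+\\[^\s"]+')


def is_unc_path(path):
    """Return True if `path` starts with two slashes, i.e. looks like a UNC path."""
    # Normalise forward slashes so //server/share is also recognised.
    return str(path).replace("/", "\\").startswith("\\\\")


def find_unc_paths(batch_text):
    """Return every UNC-style path found in the text of a batch file."""
    return UNC_PATTERN.findall(batch_text)
```

Note that a "net use" line legitimately contains a UNC path, so a match on such a line is expected rather than a problem.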


TCF does not exist

The .tcf name or filepath might be incorrect, or unsupported UNC paths might have been used.
Suggestions:

  • Check that the .tcf file name and filepath (if used) are correct.
  • TUFLOW doesn't currently support UNC paths. The folder containing the model has to be accessed through a mapped drive. Type "net use <drive>: \\server_name\share_name" in the command line to map the desired drive.


TUFLOW crashes during model initialisation

A crash at the start of the simulation might be caused by erroneous input data or an error in the control files. It might be reported as a standard TUFLOW error, or as a Fortran compiler error that leaves messages only in the console window.
Suggestions:

  • Insert "pause" at the end of the batch file to keep the console window open. The control file or GIS layer listed just above the error is the likely cause of the issue.
  • Write the console output to a text file, e.g. “TUFLOW.exe my_model.tcf > dump.txt 2>&1”. This redirects console output messages as well as the standard error stream to the “dump.txt” file, which will likely record more error information than the usual TUFLOW log file.
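
The same capture of console output and the standard error stream can be done from a script. The following is a minimal Python sketch using the standard `subprocess` module; the function name `run_and_capture` and the file name "dump.txt" are illustrative, and the command passed in is whatever would normally go in the batch file.

```python
import subprocess


def run_and_capture(command, log_path):
    """Run `command`, write stdout and stderr to `log_path`, return the exit code."""
    with open(log_path, "wb") as log:
        # stderr=subprocess.STDOUT merges the error stream into the same file,
        # which mirrors "> dump.txt 2>&1" on the command line.
        completed = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    return completed.returncode
```

A call such as `run_and_capture(["TUFLOW.exe", "my_model.tcf"], "dump.txt")` would then leave both streams in one file for inspection after a crash.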

If multiple large models are initialising at the same time, this could cause a memory overload and stop the simulations.
Suggestions:

  • Display the "Peak working set (memory)" column in Task Manager to confirm the memory overload; it is not displayed by default.
  • Insert "timeout xxx" in the batch file between the runs to allow the models to initialise separately, where xxx is the number of seconds to wait before the next run starts.
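
The staggered start described above can also be sketched in Python. This is an illustration only, assuming each run is given as a list of command-line arguments; the function name `launch_staggered` is hypothetical, and a fixed delay does not guarantee that the previous model has finished initialising.

```python
import subprocess
import time


def launch_staggered(commands, delay_seconds):
    """Start each command in turn, pausing between launches; return exit codes."""
    processes = []
    for index, command in enumerate(commands):
        if index > 0:
            # Give the previous model time to initialise before the next starts.
            time.sleep(delay_seconds)
        processes.append(subprocess.Popen(command))
    # Wait for every run to finish and collect its exit code.
    return [process.wait() for process in processes]
```

The delay should be chosen from the initialisation time observed for the largest model.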

If there is no clear indication of what the cause might be, send a snapshot of the console window and the .tlf file to support@tuflow.com.

TUFLOW crashes randomly at any time during the simulation

This can happen when TUFLOW is writing outputs to a network drive, or the model uses a TUFLOW executable located on a network drive, and the computer loses its connection to the network. When this happens there may be no message box, or only a window stating that TUFLOW.exe has stopped working, and nothing error-related is written to the .tlf file. If multiple models were running simultaneously, all unfinished models would crash. This applies to the Windows 10, 8 and 7 operating systems. With Windows XP, the simulation would only pause, and would resume once access to the network drive was restored. The difference in behaviour is unfortunately determined by the operating system, and as far as we are aware nothing can be done within the TUFLOW code to handle this situation.
Suggestions:

  • Use a TUFLOW executable saved on a local drive.
  • Set a local drive as the output drive for all checks, results and logs. This can be done using the Output Drive command in the .tcf. All outputs can be copied back to the network drive at the end of the simulation if required. Robocopy can be added to the end of the .bat file; see the example below.

If the issue persists after trying all the suggestions, contact support@tuflow.com, attaching the .tlf file and a snapshot of the console window.

@echo off

set TUFLOWEXE_iSP=O:\TUFLOW\Releases\2020-01\2020-10-AA\TUFLOW_iSP_w64.exe
set RUN_iSP=start "TUFLOW" /wait "%TUFLOWEXE_iSP%" -b

set A=5m 2.5m
set B=EXG DEV
set source_results=D:\TUFLOW\results
set source_log=D:\TUFLOW\runs
set destination_results=O:\TUFLOW\support\results
set destination_log=O:\TUFLOW\support\runs

FOR %%a in (%A%) do (
    FOR %%b in (%B%) do (
        rem Run model
        echo Running Cell Size %%a Model Scenario %%b
        %RUN_iSP% -s1 %%a -s2 %%b M10_~s1~_~s2~_003.tcf

        rem Move results folder to different location
        robocopy "%source_results%" "%destination_results%" /e /move

        rem Move log folder to different location
        robocopy "%source_log%" "%destination_log%" /e /move
        timeout 5
    )
)
pause

TUFLOW is crashing at the same/similar time of the day/week

A scheduled process such as a backup, update, restart or deduplication might be preventing the simulation from writing to the network drive. In this case TUFLOW does not release the licence at the end of the simulation; instead, the CodeMeter system determines that the TUFLOW application is no longer running and releases the licence itself. This shows as “Handle xxx automatically released. The application is no longer available.” in the cmDust file.
Suggestions:

  • Ensure storage locations that results are written to have ample free space. Depending on your infrastructure, reported free space may not be accurate, or performance of storage may substantially degrade as it nears being full.
  • Check with IT whether such processes are occurring and if reasonable limits/preferences/priorities can be set to reduce (disk/network) resource use, to allow other processes to run in parallel.
  • Run models locally if it is known the runs will coincide with such processes. More information on running models locally is in the section "TUFLOW crashes randomly at any time during the simulation".
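
To see whether the crash times line up with scheduled processes, the cmDust file can be searched for the release message quoted above. The following is a minimal Python sketch, assuming the cmDust output is a plain text file; the function name and the search phrase "automatically released" are illustrative choices taken from that message.

```python
def find_released_handles(log_path, marker="automatically released"):
    """Return (line number, text) for each line of the log containing `marker`."""
    matches = []
    with open(log_path, "r", encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            # Case-insensitive match so small wording differences still hit.
            if marker.lower() in line.lower():
                matches.append((number, line.strip()))
    return matches
```

Any timestamps near the matching lines can then be compared with the IT schedule for backups, updates or restarts.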


TUFLOW crashes at the end of the simulation before writing out maximum results

The output drive might not have enough free space to write out the full results. If multiple TUFLOW simulations run in parallel, the free space might fill up faster than expected. Other processes can also fill up the drive, e.g. other software, backups, or other users copying data. Never plan to fill the target drive completely, as performance will suffer for all processes.
Suggestions:

  • Make sure there is enough free space on the output drive.

If the output drive does have more than enough space, insert "pause" at the end of the batch file to keep the console window open and send a snapshot of this window and the .tlf file to support@tuflow.com.
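
Free space on the output drive can be checked before a batch of runs is started. The following is a minimal Python sketch using the standard `shutil.disk_usage` call; the function names and the idea of a required size in gigabytes are assumptions for illustration, and on some storage systems the reported free space may not be accurate.

```python
import shutil


def free_space_gb(folder):
    """Return the free space, in gigabytes, on the drive that holds `folder`."""
    return shutil.disk_usage(folder).free / 1024 ** 3


def has_enough_space(folder, required_gb):
    """Return True if the drive holding `folder` has at least `required_gb` free."""
    return free_space_gb(folder) >= required_gb
```

The required size has to be estimated by the modeller, for example from the result sizes of earlier runs, with a generous margin when several simulations write to the same drive.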

TUFLOW can't find a valid licence

Confirmation that the licence is the issue can usually be found in the .tlf file.

Unmaintained dongle

Once a year (usually after mid-year) a new TUFLOW release will need the licence to be updated. This will show in the .tlf file as "Unmaintained since <year>". Follow the WIBU Licence Update Request instructions to get the licence updated.

Running very old TUFLOW releases

When running an old legacy model with its original TUFLOW release, there might be an error "Could not find standalone or network dongle server". Such versions of TUFLOW can only run with the old blue softlok licence dongle. If only the metal Wibu dongle is available, the DB version of the same year's release can be used. We still have some of the old dongles and can rent them out if required.
More information on TUFLOW licence dongles and which releases are affected: TUFLOW Licensing

No TUFLOW licence settings files found

Some users might mistake this sentence in the .tlf file for a sign that there is an issue with the licence. The real cause of the crash is noted in the last few lines of the .tlf file. The "No TUFLOW licence settings files found" sentence is followed by "Default settings applied", and the default settings are listed on the next three lines, e.g. WIBU Retry Time, WIBU Retry Count, WIBU Dongles Only. A licence control file (.lcf) only needs to be created if these settings are required to be different.

Technical licence issues

Wibu Dongles Troubleshooting