Distributed OrcaFlex consists of three separate programs. A Distributed OrcaFlex client program runs on each machine that is to process OrcaFlex jobs (each client machine must have an OrcaFlex licence). One machine on the network runs the Distributed OrcaFlex server program, which coordinates the list of OrcaFlex jobs and allocates them to the clients. Finally, a Distributed OrcaFlex viewer program displays the list of jobs and their current status (e.g. pending, running or completed) and allows jobs to be submitted and stopped. The availability and job capacity of each client can also be managed from the viewer program. The viewer and server programs do not use an OrcaFlex licence.
To minimise the impact on a user’s work, the client program runs at a low operating system priority. This ensures that OrcaFlex jobs run in the background and give way to higher priority user tasks.
Downloading and Installing Distributed OrcaFlex
The latest version of Distributed OrcaFlex is 7.0a, which can be installed by following these steps:
- Download DOF Manual.pdf, which contains documentation for Distributed OrcaFlex including an installation guide.
- Download the following file: DistributedOrcaFlex.zip
- Unzip the contents, and run the extracted file DistributedOrcaFlex.msi.
Note: The minimum supported version of OrcaFlex is 10.0.
Note: The client runs as a Windows service, but it will not run under the ‘Local system’ account. During the installation you will be prompted for a set of credentials for the client service to run under. We recommend that you create a new user, for example ‘DOFUser’, that can then be used for all installations of the Distributed OrcaFlex client. This user should be created before you begin the installation and should be used only for the Distributed OrcaFlex client service. This user must have the “Log on as a service” right and must be able to read and write to all areas of the network filing system that jobs may be submitted from, including the location of OrcFxAPI.DLL and other required DLL files. The “Log on as a service” right is normally set by group policy on the domain controller.
If you intend to run OrcaFlex models that use external functions or post-calculation actions, then an appropriate version of Python must be present on the client computer. This should be a 64-bit version of Python that is supported by the versions of OrcaFlex that will be used.
More details on installing Python for OrcaFlex can be found on the OrcaFlex Python API page.
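One quick way to confirm that a Python installation on a client machine is a 64-bit build is to ask the interpreter itself. The following is a minimal sketch using only the Python standard library. The function name `python_is_64_bit` is illustrative and is not part of OrcaFlex or Distributed OrcaFlex. It does not check whether the Python version is supported by a given OrcaFlex version.

```python
import platform
import struct
import sys


def python_is_64_bit() -> bool:
    """Return True if the running interpreter is a 64-bit build."""
    # A native pointer ("P") occupies 8 bytes in a 64-bit process.
    return struct.calcsize("P") * 8 == 64


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Python version:", platform.python_version())
    print("64-bit build:", python_is_64_bit())
```

Run the script with the interpreter that the client service account will use, since several Python installations may be present on the same machine.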
- DOF can now handle simulation restarts. Restart parents are identified in the DOF Viewer when submitting a batch of jobs and the DOF Server will ensure that the dependency chain is run in the correct order if the ‘Respect restart sequence’ check box is selected when submitting jobs. When adding restart child models, you can automatically include the restart parents if you check the ‘Include restart parents’ check box. The child restart models must be ‘.yml’ text data files.
- The DOF Viewer can display full details for a selected job in a pop-up window from the context menu. Client machine details can be viewed in a pop-up from the context menu in the DOF Viewer OrcaFlex Clients tab. The DOF Server settings can be seen in a pop-up window by double-clicking the DOF Viewer status bar.
- The speed of handling of the job status and progress messages in the DOF Server has been improved.
- The list of DOF Clients kept by the DOF Server is now saved in a structured text (YAML) file. This file can be edited to give a DOF Client an alias name for display in the DOF Viewer in place of its machine name.
- Bug Fix: If a client disconnected during communication with the DOF Server this would cause an error in the server.
- Bug Fix: A client could fail to provide a MAC address to the DOF Server which meant that the ‘Wake on LAN’ feature would not work.
- Bug Fix: If a DOF Client was disabled whilst running jobs, these jobs were given a ‘moving’ status but could block later scheduling to that DOF Client if other client machines were busy.
- Bug Fix: A bug in version 6.2a resulted in simulations not being able to resume from a paused simulation file. This situation occurs when a running simulation is manually paused from the DOF Viewer, or the simulation is auto-paused by DOF as a result of being moved between clients (for instance if the processor count is reduced on a DOF Client). The DOF Client saves an interim sim file that is reloaded when the job is continued on another machine. The bug resulted in the reload of this sim file being handled incorrectly and the simulation was then marked as ‘Completed’ with the message ‘no analysis performed’.
- Previously, when submitting jobs, you specified whether you wanted both statics and dynamics to be performed, or just statics. Starting in OrcaFlex 11.1 the input data specifies whether statics and/or dynamics are to be performed. Because this setting has been moved into the model data, it is no longer possible to make this choice when submitting jobs in Distributed OrcaFlex. However, it is sometimes useful to be able to skip dynamics even if it is enabled in the input data. A new skip dynamics option has been added to the submit job dialog which mirrors the identically named option on the OrcaFlex batch processing form.
- The console program (dofcmd.exe) can now list jobs in both yaml and csv format. Previously only csv format was available.
- Jobs with post calculation actions would sometimes be restarted by the server if the post calculation action spent too long processing without sending a progress update. This release fixes that issue, but also requires that the Python OrcFxAPI module is updated.
- This release adds support for the new FlexNet Publisher licence system introduced in OrcaFlex version 11.0a. This requires further DLLs, in addition to OrcFxAPI.dll, to be available in your network folder structure. Without these additional DLLs, attempting to run models using OrcaFlex 11.0a or later will result in errors.
- There have been some minor operational changes designed to reduce incidents of jobs becoming stuck in a scheduled state but not running.
- When a DOF Client disconnects from the DOF Server it is now listed in the DOF Viewer as ‘Disconnected’ rather than ‘Sleeping’ or ‘Unavailable’ since the DOF Server has no knowledge of the actual reason for disconnection.
- Wake-On-LAN is now disabled by default, since the WOL feature is usually disabled by default in the client computer BIOS.
- A new Orcina logo.
- Adds a small delay between starting DOF Client processes running on the same machine (computers with large core counts may start more than one DOF Client). This is to allow more time for the DOF Server to add each client as it connects.
- Supports a new external function attribute (‘CanResumeSimulation’) due in OrcaFlex 10.3b to identify functions that do not save their processing state correctly (often they are using code provided by a third party). This means the simulation cannot resume from a partially run state and consequently DOF will not auto-save the model. If DOF is required to pause or move one of these models then the DOF Client will ignore this and continue running the simulation. This only applies to models run with OrcaFlex DLL version 10.3b or later. With earlier DLL versions the simulation will be paused, moved or auto-saved as normal, but it will not resume correctly if it uses such an external function.
- The DOF Server default settings for writing the job list and job log files have been changed to not write these files, see the DOF Manual for more details.
- Bug fix: Sometimes, in the event of an error, the DOF Server would produce a cascade of error reports that made the DOF Server unresponsive for a while. This is now resolved: the DOF Server reports all errors to DOFServer.log without generating any further error files.
- Bug fix: When using dofcmd to submit jobs, an Autosave interval of 0 was rejected, even though 0 is a valid value that disables autosaving.
- Bug fix: If the DOF Server was restarted while some jobs were still running on clients, then those jobs could end up being cancelled rather than re-added to the job list to continue as normal.
- Bug fix: If jobs were submitted whilst the DOF Server was already distributing jobs to DOF Clients then the scheduler ramping feature was re-initiated. Now, the ramping feature only starts if the jobs are added when the DOF Server is idle.
- In the client list view of the DOF Viewer, the list columns are now resizeable.
- Bug fix: At startup, a DOF Client machine running multiple client processes would appear in the DOF Viewer with a low processor count rather than the true total for the machine. A related problem was that setting the client’s processor count to ‘None’ through the DOF Viewer had no effect.
- Bug fix: If an error occurred in the DOF Server caused by a repeating problem (such as a communication error), then multiple error reports were created that could fill the C:\ProgramData\Orcina\DOF directory, and prevent the DOF Server from responding to the DOF Clients and Viewer.
- Bug fix: When adding small job batches, the jobs could all be scheduled and queued locally to a small number of DOF Clients, leaving idle other clients that should have been sharing the processing.
- The major change in this release is the ability to have more than one DOF Client running on the same physical computer. This enables Distributed OrcaFlex to make proper use of all the processor cores on a computer that has processor groups, generally large capacity servers. One DOF Client process starts per processor group to give full utilisation, and optionally the number of DOF Clients can be set higher than this. This will also benefit models using Python external functions or post-calculation actions, as there will now be a Python interpreter per DOF Client process, reducing the impact of the processing bottleneck that the Python interpreter introduces.
- Jobs can now be manually paused and resumed from the DOF Viewer. A paused job will remain so until resumed by the user from the DOF Viewer.
- DOF Server functions that automatically move jobs between clients have been removed. This includes forcibly pausing and moving one user’s jobs to make way for another, and moving jobs from slower to faster computers towards the end of a batch run. In the previous version of DOF these functions were disabled by default. These functions offered only limited benefits and in some cases unnecessarily moved jobs. Removing the functions allows for a more streamlined server. The new manual pause and resume feature can be used to achieve the same ends.
- You can optionally set up the DOF Server to operate as a straightforward batch processor (using a registry setting). In this mode processors are not shared between users. Instead, jobs are run in the order they are submitted to DOF.
- Each DOF Client has a small queue for buffering pending jobs sent by the DOF Server. This reduces the time between finishing one job and starting another, which is particularly beneficial for shorter jobs.
- Processing of new job batches is ramped up slowly (over about 2 minutes). This smooths the job throughput by preventing a spike of file server activity when jobs start or finish at the same time. This is enabled by default but can be disabled using a registry setting.
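The ramping idea in the last item can be pictured with a short sketch. The actual scheduling rule used by the DOF Server is not described here, so the linear ramp, the 120 second default and the name `allowed_running_jobs` are all assumptions made for illustration only.

```python
import math


def allowed_running_jobs(elapsed_s: float, capacity: int, ramp_s: float = 120.0) -> int:
    """Return how many jobs may run, elapsed_s seconds after a batch starts.

    Assumes a linear ramp from one job up to full capacity over ramp_s
    seconds. This is a hypothetical model, not the DOF Server's own rule.
    """
    if capacity <= 0:
        return 0
    if elapsed_s >= ramp_s:
        # Ramp finished: every available processor may be used.
        return capacity
    fraction = max(elapsed_s, 0.0) / ramp_s
    # Always allow at least one job so that the batch can start immediately.
    return max(1, math.ceil(capacity * fraction))


if __name__ == "__main__":
    for t in (0, 30, 60, 120):
        print(f"t = {t:>3} s: up to {allowed_running_jobs(t, 16)} of 16 jobs")
```

Spreading job starts over the ramp period means that model files are read from the file server gradually, and jobs of similar length then tend to finish at staggered times as well.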