Running KFX and EXSIM on OneCompute

OneCompute is DNV's general-purpose interface to cloud computing. Currently OneCompute supports Azure.

Azure is Microsoft's cloud computing platform, providing nearly unlimited computing and storage capacity.

KFX™ is a CFD tool for simulating all sorts of turbulent combustion.

Introduction

The purpose of this document is to give KFX and EXSIM users the know-how to use OneCompute efficiently. Before we start, let us remind ourselves of the benefits of OneCompute.

Benefits of OneCompute

  • No need to oversize the local cluster for the large studies that only come up occasionally.
  • Using OneCompute together with a local cluster gives flexibility.
  • Moving data back and forth can be a hassle; OneCompute attempts to minimize it.

Basic terminology

A few terms need to be introduced before you start using OneCompute.

Client

The client program oc or oc.exe is the tool to interact with the computing resources.

It can be invoked as $KB/oc, or simply as oc if it is available in the path. If no command is specified, the client starts as a REPL, where commands are entered interactively. If a command is specified, that command is executed and oc exits. The latter is useful for building scripts.
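Since oc exits after running a single command, it can be driven from a script. The sketch below is only an illustration and not part of oc: the helper names build_oc_command and run_oc are hypothetical, and the only thing assumed about oc is what is stated above, namely that it is called as oc followed by a command and its arguments.

```python
import shutil
import subprocess


def build_oc_command(command, *args, oc_path="oc"):
    """Return the argument list for a one-shot oc call, e.g. ['oc', 'Get-Help']."""
    if not command:
        # Without a command oc would start its interactive REPL instead.
        raise ValueError("a command is required for a one-shot call")
    return [oc_path, command, *map(str, args)]


def run_oc(command, *args, oc_path="oc"):
    """Run one oc command and return its text output (requires oc to be installed)."""
    if shutil.which(oc_path) is None:
        raise FileNotFoundError(f"{oc_path!r} was not found")
    result = subprocess.run(
        build_oc_command(command, *args, oc_path=oc_path),
        capture_output=True,
        text=True,
        check=True,  # raise if oc reports a non-zero exit code
    )
    return result.stdout


print(build_oc_command("Get-Help", "Add-Cases"))  # ['oc', 'Get-Help', 'Add-Cases']
```

Keeping the command construction separate from the call makes it possible to inspect or log the exact command line before anything is submitted.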

Project

To operate OneCompute with KFX and EXSIM, files and folders are created on the local computer in a project folder. The command Set-Project is used to configure that folder and to switch between projects. Only one project can be active at a time.

Job

A Job is a collection of KFX and EXSIM scenarios to be computed and administered as one unit. Create-Job jobA creates a job named jobA. Create-Job is also used to switch between jobs.

JobId

A JobId is a GUID that uniquely identifies an instance of a job and is meant to be used by computers. An example of a JobId is 5679f60f-5856-463d-b0d0-3c3c1efc1505. In commands, we normally use the JobRef instead of the JobId.

JobRef - Job Reference

A JobRef has the form ProjectName-JobName-N, for example ProjectName-JobName-1, and is an alias for the GUID. The trailing number distinguishes instances of the same job. When a job is started with the Start-Job command, the new instance is given a JobId and a JobRef. Both JobId and JobRef can be used in most commands.

Example from the Get-JobId command: The JobId of JobRef < Demo-jobA-8 > = < 5679f60f-5856-463d-b0d0-3c3c1efc1505 >
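When scripting, the JobRef and JobId can be picked out of a line like the one above. The following is a minimal sketch, assuming the line format shown in the example; the function names are hypothetical, and split_jobref additionally assumes that the job name itself contains no hyphen.

```python
import re
import uuid

# Matches "... JobRef < Demo-jobA-8 > = < 5679f60f-... >" as in the example line.
JOBID_LINE = re.compile(
    r"JobRef\s*<\s*(?P<jobref>[^<>]+?)\s*>\s*=\s*<\s*(?P<jobid>[^<>]+?)\s*>"
)


def parse_get_jobid_line(line):
    """Return (JobRef, JobId) from one line of Get-JobId output."""
    match = JOBID_LINE.search(line)
    if match is None:
        raise ValueError(f"not a Get-JobId line: {line!r}")
    job_id = uuid.UUID(match["jobid"])  # raises ValueError if this is not a GUID
    return match["jobref"], str(job_id)


def split_jobref(jobref):
    """Split 'ProjectName-JobName-N' into (project, job, instance number).

    Assumption: the job name has no hyphen, so splitting from the right is safe.
    """
    head, instance = jobref.rsplit("-", 1)
    project, job = head.rsplit("-", 1)
    return project, job, int(instance)


line = "The JobId of JobRef < Demo-jobA-8 > = < 5679f60f-5856-463d-b0d0-3c3c1efc1505 >"
jobref, job_id = parse_get_jobid_line(line)
print(jobref, job_id)
print(split_jobref(jobref))  # ('Demo', 'jobA', 8)
```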

Worker

A worker is a virtual computer created specifically for one and only one simulation (case). It has a unique WorkerId (GUID). To reference a simulation, we specify the case name in the context of a JobRef instead of the GUID.

First time setup of OneCompute for KFX and EXSIM

Follow the instructions below carefully. See an introduction video in the Tutorials section below.

Registration for OneCompute

To use OneCompute, go to the OneCompute Portal, follow the instructions, and accept the agreement. Then let your contact at DNV know that you need to be activated. The onboarding process is shown in the video below:

When registered and activated, the next step is to specify a local project folder which oc may use.

First time startup of oc on the computer

Open a KFX and EXSIM enabled terminal, for instance from the main start-up window of KFX (see Figure 1), and start oc:

liccmd.png
Figure 1: The KFX command prompt

oc

oc then prints instructions showing the different ways to define the project folder. Pick one of them and apply it with the Set-Project command.

------>Welcome to OneComputIT - Running KFX and EXSIM in the cloud<-----------
To get started please specify a folder to store cases, Jobs and results by running:
oc Set-Project <ProjectName> This will create a folder <ProjectName>
oc Set-Project <FullPathToFolder> The last folder will be the name of the Project
oc Set-Project without options will use current dir as ProjectFolder
If still unsure a good choice could be: oc Set-Project /home/COMPUTIT/rols/KFX-Work/OneCompute
OneCompute will then be the name of the Project

Set-Project (sp)

The Set-Project command configures oc to use the given folder; the name of the last folder in the path becomes the ProjectName.

Below, the full path to the folder is used:

oc Set-Project /home/COMPUTIT/rols/ldev/OneComputePlantCFD-Documentation/TestProject

and it gave the following output:

ProjectName=TestProject
Store cases to run in folder /home/COMPUTIT/rols/ldev/OneComputePlantCFD-Documentation/TestProject/Cases
Jobs are stored locally /home/COMPUTIT/rols/ldev/OneComputePlantCFD-Documentation/TestProject/Jobs
Results are downloaded to    /home/COMPUTIT/rols/ldev/OneComputePlantCFD-Documentation/TestProject/Results
-----------------------------------------------------------
when _done adding cases continue with:
   Start-Case <casename>
or Create-job, Add-Cases, Upload-job, Start-Job, and when done Get-Results
-----------------------------------------------------------
Please use Get-Help, Get-Help <command>, Get-Info and Get-Alias frequently!
Happy OneComputing!!
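The folder layout reported above can be reproduced in a few lines, which is handy when a script needs to know where cases go and where results will arrive. This is a sketch based only on the output shown; project_layout is a hypothetical helper, not an oc function.

```python
from pathlib import PurePosixPath


def project_layout(project_dir):
    """Return the project name and the three sub-folders reported by Set-Project."""
    root = PurePosixPath(project_dir)
    return {
        "ProjectName": root.name,      # the last folder of the path
        "Cases": root / "Cases",       # cases to run are stored here
        "Jobs": root / "Jobs",         # jobs are stored locally here
        "Results": root / "Results",   # results are downloaded to this folder
    }


layout = project_layout(
    "/home/COMPUTIT/rols/ldev/OneComputePlantCFD-Documentation/TestProject"
)
for key, value in layout.items():
    print(f"{key:12} {value}")
```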

Let us do what oc instructs us!

Get-Help

Rerun oc to start the REPL, or run Get-Help directly as below:

oc Get-Help

The output doubles as a greeting and gives a quick overview of the settings and the most frequently used commands.

--------------------------------------------------------------------
 ProjectName - OneCompute
 ProjectDir  - C:\Users\rols\KFX-WORK\OneCompute
 JobName     - T03
 CurrentDir  - C:\home\rols\ldev\Prosjekt\DebugPoolF
--------------------------------------------------------------------
The following commands may be used
  Set-Project          - [path] Set path as ProjectDir and init.
  Create-Job           - <Name> Create Job.                  
  Add-Cases            - [Cases Folder] Add Cases from Cases Folder to Job and prepare for upload.
  Upload-Job           - Upload job to cloud storage.        
  Start-Job            - [-d <days>] [-s] [-p <poolId>] Submit a new instance of job for execution..
  Stop-Job             - [JobRef] [-q] [Name] sends stop to (all) or specific worker.
  Get-Jobs             - [-All] [-JobId] Get a list of running Jobs (JobRefs).
  Get-Status           - [JobRef] [Name/-All] [-d/-t] Progress of workers in a JobRef.
  Send-Control         - [JobRef] <Name/-All> sends kfx Control command to worker.
  Sync-Results         - [JobRef] [-r] [Name] [filter] Download results (default filter="*.r3d *.kfx.gz").
  Get-Help             - Short help menu, option <command> detailed help about command.
  Get-FullHelp         - List All Commands.                  
  Get-Alias            - Display a list of alias for the commands.
  Quit                 - Quit the oc program.                
-----cmd=get-help -> _done -------- ? to Get-Help
<OneCompute-T03-13>|</OneCompute>

There are several details to notice; let us start with the command prompt.

Command Prompt

The REPL shows a command prompt as its last line:

<OneCompute-T03-13>|</OneCompute>

The brackets display the following key information:

<JobRef>|<Current folder in Cloud storage>|<Current local folder>

Most commands operate in the context of the current JobRef, so it is important to keep an eye on it.
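A script that reads the REPL output can recover the current JobRef from the prompt. The sketch below assumes the prompt format described above, with fields in angle brackets separated by |; parse_prompt is a hypothetical helper, and a field that is not shown in the prompt is returned as None.

```python
def parse_prompt(prompt):
    """Split '<JobRef>|<cloud folder>|<local folder>' into a dictionary."""
    names = ("jobref", "cloud_folder", "local_folder")
    fields = [part.strip() for part in prompt.strip().split("|")]
    # Drop the surrounding angle brackets of each field, if present.
    values = [
        field[1:-1] if field.startswith("<") and field.endswith(">") else field
        for field in fields
    ]
    result = dict.fromkeys(names)      # fields missing from the prompt stay None
    result.update(zip(names, values))
    return result


print(parse_prompt("<OneCompute-T03-13>|</OneCompute>"))
```

For the prompt above this gives the JobRef OneCompute-T03-13 and the cloud folder /OneCompute, with no local folder.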

Now we are ready to do our first calculations in OneCompute. Before we jump into that, let us look at how parameters are written and at the supported types of calculations.

Mandatory and optional parameters to a command

The commands follow a Verb-Noun pattern as in PowerShell. The commands have <Mandatory> and [Optional] parameters. For instance, the command Create-Job has <Name> in angle brackets, so a name must be provided.
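The bracket convention is regular enough to be read by a program. The sketch below splits the parameter part of a help line into mandatory and optional parameters; split_parameters is a hypothetical helper, and it treats anything inside [ ] as optional, including a value placeholder such as <days> that belongs to an optional switch.

```python
import re


def split_parameters(signature):
    """Return (mandatory, optional) parameter lists from a help signature."""
    # Everything written within [ ] is optional.
    optional = re.findall(r"\[([^\]]+)\]", signature)
    # Remove the optional groups first, so that "<days>" in "[-d <days>]"
    # is not mistaken for a mandatory parameter.
    remainder = re.sub(r"\[[^\]]*\]", " ", signature)
    mandatory = re.findall(r"<([^<>]+)>", remainder)
    return mandatory, optional


print(split_parameters("<Name> Create Job."))
# (['Name'], [])
print(split_parameters("[-d <days>] [-s] [-p <poolId>] Submit a new instance of job"))
# ([], ['-d <days>', '-s', '-p <poolId>'])
```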

Supported scenarios and workflows

Hybrid

Imagine a computer with infinite CPU and storage. OneCompute transforms the desktop into such a device. Scenarios are still defined on the desktop, but the actual calculations are submitted to OneCompute as if they were running on the desktop.

ocoffload.png
Figure 2: A dual offload of calculations by combining on-premise and OneCompute computing resources.

Cloud Desktop

By running a remote desktop in DNV Remote Services, it is possible to offload the entire desktop to a cloud-based workflow. This is of interest when a GPU-enabled desktop is needed temporarily, as is often the case when operating KFX and EXSIM.

remoteservices.png
Figure 3: Offloading the entire desktop to a cloud desktop.

One of the goals of KFX on OneCompute is to handle studies with a large number of scenarios.

Specifying input and output

When specifying input and output, avoid explicit paths so that the case is not location dependent. The following ways of specifying KFX and EXSIM scenarios are supported:

EXSIM

A case which is ready to compute with EXSIM. Any preprocessing and postprocessing should be done on the desktop.

KFX

fsc

A case specified with fsc may be executed, provided that:

  • There is one case in one folder.
  • All files required to execute the case are in that folder.
  • There is only one fsc file in that folder.
  • There is no scn, json or wex file in that folder.
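The file-type conditions in this list can be checked before a case is submitted. The following is a minimal sketch with a hypothetical helper, check_fsc_folder; it identifies file types by their extension (an assumption) and cannot verify that all files required by the case are present.

```python
from pathlib import Path


def check_fsc_folder(folder):
    """Return a list of problems; an empty list means the checks passed."""
    suffixes = [p.suffix.lower() for p in Path(folder).iterdir() if p.is_file()]
    problems = []
    fsc_count = suffixes.count(".fsc")
    if fsc_count != 1:
        problems.append(f"expected exactly one fsc file, found {fsc_count}")
    for forbidden in (".scn", ".json", ".wex"):
        if forbidden in suffixes:
            problems.append(f"{forbidden[1:]} file present")
    return problems
```

Calling check_fsc_folder on a case folder and refusing to continue when the returned list is not empty catches the most common mistakes before any upload takes place.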

scn

Scenario files may be executed. Single-case scenario files may use Start-Case, while multi-case scenario files should use Create-Job, Add-Cases, Upload-Job and Start-Job.

json

OneCompute is well suited for Phast scenarios, which are specified in the json format.

xml files

xml files with spawn commands are supported for all types of KFX scenario specification.

Single scenario calculations

If there is just a single scenario to run, the calculation may be submitted to OneCompute by:

oc start-case CaseName

This creates a job named CaseName with the single scenario CaseName, then uploads and starts it in one go. It then starts a job monitor that displays the output as the simulation progresses. This serves as a quick way to start a simulation. Killing the job monitor does not stop the calculation.

Multiple scenario calculations

To unleash the real power of KFX on OneCompute, put more than one case in a job. Depending on the complexity of the scenarios, the sweet spot is around 5 cases per job. First, copy the cases to run into the Cases folder of the project folder. Run the command Create-Job JobName, then Add-Cases to add all cases from the Cases folder to the Jobs/JobName folder. If the Cases folder contains 8 cases and only 5 are wanted in this job, simply delete the unwanted cases from the Jobs/JobName folder. Now run Upload-Job and Start-Job. While the job is running, use the command Get-Status (listed as Get-JobStatus in some oc help output) to show the status and get an estimated time to completion.
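For a larger study, the cases can be divided into jobs of roughly this size up front. The sketch below is only a planning aid; batch_cases and the case names are hypothetical, and the value 5 is simply the approximate sweet spot mentioned above.

```python
def batch_cases(case_names, cases_per_job=5):
    """Split a list of case names into consecutive groups, one group per job."""
    if cases_per_job < 1:
        raise ValueError("cases_per_job must be at least 1")
    return [
        case_names[start:start + cases_per_job]
        for start in range(0, len(case_names), cases_per_job)
    ]


cases = [f"case{number}" for number in range(1, 9)]  # 8 hypothetical cases
batches = batch_cases(cases)
print([len(batch) for batch in batches])  # [5, 3]
```

Each group then corresponds to one Create-Job, Add-Cases, Upload-Job and Start-Job sequence.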

[Optional] vs <mandatory> parameters to a command

oc has a well-thought-out help system, and most commands are self-explanatory. A couple of nuances are worth knowing.

$KB/oc Get-Help Add-Cases

[Cases Folder] is written within [ ], which means that it is an optional argument. The description also shows that Add-Cases accepts more options than the Cases folder. If no argument is given, the Cases folder created by the Set-Project command is used.

-----------------------------------------------------------------
  Add-Cases            - [Cases Folder] Add Cases from Cases Folder to Job and prepare for upload
  Alias: ac
-----------------------------------------------------------------
Description:
  scn files are parsed and split into separate cases
    -clean deletes all old cases in JobsFolder
    -r004 copies restart number 4 and set IFIL to 4
    -keepbm does not add -rm (remove) to bull2flux.
-----cmd=get-help -> _done -------- ? to Get-Help

In the Create-Job command, by contrast, Name is written within < >, which means that Name is mandatory.

$KB/oc Get-Help Create-Job
-----------------------------------------------------------------
  Create-Job           - <Name> Create Job
  Alias: change-job
  Alias: cj
-----------------------------------------------------------------
Description:
  Create a new Job of Name.

-----cmd=get-help -> _done -------- ? to Get-Help

Tutorials

Demo of running 4 Phast-KFX scenarios on OneCompute

This video covers

  • First time use
  • Registering in OneCompute
  • The OneCompute portal for costs and spending as well as administering resources
  • Start-Case running one scenario with detailed reporting
  • Setting up a Job with 3 scenarios, showing how the compute resources are enabled automatically
  • Get-JobStatus - Inspecting the progress
  • Get-Jobs - list of Jobs
  • Switch between Jobs

Howto or How do I?

oc has an excellent built-in help system. This section covers questions that operators commonly ask themselves and how to answer them.

List commands or aliases

Use the command Get-Alias to list the aliases and Get-FullHelp to list all commands.

Get-FullHelp

The complete help menu.

$KB/oc Get-FullHelp
-----------------------------------------------------------------
ProjectName - TestProject
ProjectDir  - /home/COMPUTIT/rols/ldev/OneComputePlantCFD-Documentation/TestProject
JobName     - Unknown
-----------------------------------------------------------------
The following commands may be used
  Add-Cases            - [Cases Folder] Add Cases from Cases Folder to Job and prepare for upload.
  Cancel-Job           - [JobRef] Abort all workers in the Job.
  Copy-File            - Copy a file/folder within cloud storage.
  Create-Job           - <Name> Create Job.                  
  Create-RestartJob    - [-r/-t] Creates a new Job from current JobRef to easily restart simulation.
  Delete-JobRef        - [-y] <JobRef*> Deletes JobRef matching option and corresponding Results.
  Download-Job         - Download current Job to local storage.
  Get-Alias            - Display a list of alias for the commands.
  Get-AzCopy           - Get a command to download with azcopy.
  Get-CaseChildItem    - <CaseName/-All> asks worker of CaseName to give a list of files in working dir.
  Get-ChildItem        - [-r] [dir] list files in blob storage.
  Get-Debug            - [JobId] Download debug log of either current job or with JobId.
  Get-File             - <fileInBlob> Download a file/folder from blob storage.
  Get-FullHelp         - List All Commands.                  
  Get-Help             - Short help menu, option <command> detailed help about command.
  Get-Info             - Get info about pool and cloud storage.
  Get-JobId            - [JobRef] Get JobId and WorkerIds of JobRef.
  Get-Jobs             - [-All] [-JobId] Get a list of running Jobs (JobRefs).
  Get-LocalChildItem   - [dir] list files in blob storage.   
  Get-MyBlob           - [-w] [-d #daysValid] [-n fileName] Get a read only SasToken.
  Get-Results          - [-r] [JobRef] [Name] [filter] Download results (default filter="*.r3d *.kfx.gz").
  Get-Status           - [JobRef] [Name/-All] [-d/-t] Progress of workers in a JobRef.
  Push-File            - <CaseName/-All> <fileName> ask worker to push file to blob.
  Quit                 - Quit the oc program.                
  Remove-Item          - [filter] Remove a file/folder in cloud storage.
  Send-Control         - [JobRef] <Name/-All> sends kfx Control command to worker.
  Send-File            - <CaseName/-All> Sends a file to the worker of CaseName..
  Set-JobRef           - <JobRef> Change the current JobRef. 
  Set-KFXVersion       - [KFXVersion] change KFXVersion to run.
  Set-LocalLocation    - Change current directory in local storage.
  Set-Location         - [-l] Change current directory in cloud storage or [-l] local.
  Set-MaxDaysOfJob     - Set the default number of days for a job to expire.
  Set-Platform         - [test/develop] [UserName] Change platform.onecompute.DNV.com.
  Set-Pool             - [poolId] change to another pool of computers.
  Set-Project          - [path] Set path as ProjectDir and init.
  Set-WorkerImage      - [workerImage] change Worker to run. 
  Show-Messages        - Read short messages from job..      
  Start-Agent          - Start an Agent.                     
  Start-Case           - <case> Prepare Submit and start case in one go.
  Start-Job            - [-d <days>] [-s] [-p <poolId>] Submit a new instance of job for execution..
  Start-JobLocal       - Create a new instance of job and execute locally.
  Start-JobMonitor     - Monitor a running job.              
  Stop-Job             - [JobRef] [-q] [Name] sends stop to (all) or specific worker.
  Sync-Results         - [JobRef] [-r] [Name] [filter] Download results (default filter="*.r3d *.kfx.gz").
  Unzip-File           - <fileName> [outDir] Unzip a file (Linux).
  Upload-Cases         - Add cases and Upload to job cloud storage.
  Upload-File          - Upload a file/folder to blob storage.
  Upload-Job           - Upload job to cloud storage.
-----cmd=get-fullhelp -> _done -------- ? to Get-Help

I have three simulations that I want to run in a job

An example is given below for JobName=LiquidPool, which has three scenarios defined in json format. Follow the steps below:

Create-Job

$KB/oc Create-Job LiquidPool

Sets JobName to LiquidPool and creates the folder LiquidPool under Jobs, ready to be populated with the scenarios to run.

Add-Cases

$KB/oc Add-Cases

Copies the cases (scenarios) from the folder Cases to the folder Jobs/LiquidPool, ready to be uploaded to BLOB storage (cloud).

larsl@ubuvgpu2:~/Projects/Test$ ls Jobs/LiquidPool
N-DECANE_Rectangle_3_2_1.5_F.json  N-NONANE_Rectangle_5_2_1.5_F.json  N-OCTANE_Circle_3_1.5_F.json
larsl@ubuvgpu2:~/Projects/Test$ $KB/oc
--------------------------------------------------------------------
 ProjectName - Test
 ProjectDir  - /home/COMPUTIT/larsl/Projects/Test
 JobName     - LiquidPool
 CurrentDir  - /home/COMPUTIT/larsl/Projects/Test
--------------------------------------------------------------------
The following commands may be used
 Set-Project          - <path> Set path as ProjectDir and init.
 Start-Case           - Prepare Submit and start case in one go.
 Create-Job           - <Name> Create Job..                 
 Add-Cases            - Add Cases from Cases Folder to Job and prepare for upload.
 Upload-Job           - Upload job to cloud storage.        
 Start-Job            - Create a new instance of job and submit to the cloud for execution.
 Get-JobStatus        - <JobRef> either current job or JobRef.
 Get-Jobs             - Get a list of active Jobs.          
 Send-Control         - <CaseName> <CourantNumber> sends new courant number to worker of case.
 Get-CaseStatus       - ask worker to report status.        
 Stop-Case            - <case> <kill> sends stop/kill to worker of case.
 Get-Results          - Download all results for the Job from cloud storage.
 Get-Help             - <cmd> Detailed help about command.  
 Get-FullHelp         - List All Commands.                  
 Get-Alias            - List a list of alias for the commands.
 Quit                 - Quit the oc program.                
-----cmd=get-help -> _done -------- ? to Get-Help
<Test-LiquidPool-0>|</Test>

The Job <LiquidPool> is now ready to be uploaded to BLOB storage with the command Upload-Job.

Upload-Job

<Test-LiquidPool-0>|</Test>
-- Uploading cases from /home/COMPUTIT/larsl/Projects/Test/Jobs/LiquidPool -> BlobStorage:/Test/Jobs/LiquidPool
Uploaded: N-OCTANE_Circle_3_1.5_F.json
Uploaded: N-DECANE_Rectangle_3_2_1.5_F.json
Uploaded: N-NONANE_Rectangle_5_2_1.5_F.json
-----cmd=upload-job -> _done -------- ? to Get-Help
<Test-LiquidPool-0>|</Test>

The command Start-Job places the job in a queue, from which the calculation nodes pick it up.

Start-Job

<Unknown>|</Testing> Start-Job
-----------------------------------------------------------------
Creating new Job Testing-LiquidPool-1 of Job LiquidPool with 3 cases to Compute
-----------------------------------------------------------------
Worker 75078ccf-5fde-4023-9c70-bc9314471bd4 will compute case NONANE_Circle_5_3_D.p2k
Worker f0c7e34a-d861-430c-88f4-421cc6b111cd will compute case OCTANE_Circle_4_1.5_E.p2k
Worker 8e27da8d-72ac-4cc9-bb22-991d51bf59b8 will compute case OCTANE_Rectangle_5_4_1.5_D.p2k
-----------------------------------------------------------------
Waiting for Job Testing-LiquidPool-1 to be registered
Waiting for Job Testing-LiquidPool-1 to be registered
Waiting for Job Testing-LiquidPool-1 to be registered
Submitted Job Testing-LiquidPool-1 jobId 9ecb7393-d2c2-4ee0-a9aa-ec72c011d1fc
Connecting to Job Testing-LiquidPool-1
-----cmd=start-job -> _done -------- ? to Get-Help
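The Start-Job output tells which worker computes which case, and it is the first place where the new JobRef and JobId appear. The sketch below extracts both from captured output, assuming the line formats shown above; parse_start_job_output is a hypothetical helper.

```python
import re

GUID = r"[0-9a-fA-F-]{36}"
WORKER_LINE = re.compile(rf"^Worker\s+(?P<worker_id>{GUID})\s+will compute case\s+(?P<case>\S+)$")
SUBMITTED_LINE = re.compile(rf"^Submitted Job\s+(?P<jobref>\S+)\s+jobId\s+(?P<job_id>{GUID})$")


def parse_start_job_output(text):
    """Return the JobRef, the JobId and a case -> WorkerId mapping."""
    result = {"jobref": None, "job_id": None, "workers": {}}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        worker = WORKER_LINE.match(line)
        if worker:
            result["workers"][worker["case"]] = worker["worker_id"]
            continue
        submitted = SUBMITTED_LINE.match(line)
        if submitted:
            result["jobref"] = submitted["jobref"]
            result["job_id"] = submitted["job_id"]
    return result


sample = """\
Worker 75078ccf-5fde-4023-9c70-bc9314471bd4 will compute case NONANE_Circle_5_3_D.p2k
Worker f0c7e34a-d861-430c-88f4-421cc6b111cd will compute case OCTANE_Circle_4_1.5_E.p2k
Waiting for Job Testing-LiquidPool-1 to be registered
Submitted Job Testing-LiquidPool-1 jobId 9ecb7393-d2c2-4ee0-a9aa-ec72c011d1fc
"""
info = parse_start_job_output(sample)
print(info["jobref"], info["job_id"])
print(info["workers"])
```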

Different ways to display how far the Job has progressed

Get-Status

Gives a short list of the cases and the status of the Job on BLOB storage.

<Testing-LiquidPool-1>|</Testing> Get-Job
 -------------------------------------------------------------------------------------------
| Job Reference                        |                                JobId |       Status|
| Testing-LiquidPool-1                 | 9ecb7393-d2c2-4ee0-a9aa-ec72c011d1fc |    Completed|
 -------------------------------------------------------------------------------------------
| Case Name                            |                             WorkerId |      Message|
| NONANE_Circle_5_3_D                  | 75078ccf-5fde-4023-9c70-bc9314471bd4 |             |
| OCTANE_Rectangle_5_4_1.5_D           | 8e27da8d-72ac-4cc9-bb22-991d51bf59b8 |             |
| OCTANE_Circle_4_1.5_E                | f0c7e34a-d861-430c-88f4-421cc6b111cd |             |
 -------------------------------------------------------------------------------------------
-----cmd=get-job -> _done -------- ? to Get-Help

I have run more than one Job

Get-Jobs

Gives a list of the active Jobs on the calculation nodes. With the option -All, non-active jobs are listed as well.

<Testing-LiquidPool-1>|</Testing>|</home/COMPUTIT/larsl/Projects/Testing/Jobs/LiquidPool>Get-Jobs
|---+------------------------------+--------------------------------------+-----------------------+-----------+-----|
| - | JobRef                       | JobId                                | StartTime             | Status    |   % |
|---+------------------------------+--------------------------------------+-----------------------+-----------+-----|
| 1 | One_Compute-LiquidPooltest-1 | c3c134b3-43f1-4fbd-96dd-ca58e756fb2a | 4/15/2021 8:36:02 AM  | Aborted   | 100 |
| 2 | One_Compute-LiquidPooltest-2 | a38b8998-361c-4b6d-8146-48635ca1e4c6 | 4/15/2021 8:40:28 AM  | Completed | 100 |
| 3 | Testing-LiquidPool-1         | 9ecb7393-d2c2-4ee0-a9aa-ec72c011d1fc | 4/15/2021 12:55:20 PM | Completed | 100 |
|---+------------------------------+--------------------------------------+-----------------------+-----------+-----|

-----cmd=get-jobs -> _done -------- ? to Get-Help
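With many jobs it can be convenient to filter this table in a script, for instance to find the completed jobs whose results have not been downloaded yet. The sketch below reads the table as printed above; parse_jobs_table is a hypothetical helper that relies only on the | separators, not on the column widths.

```python
def parse_jobs_table(text):
    """Turn the Get-Jobs table into a list of dictionaries, one per job."""
    header = None
    rows = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip separator lines ("|---+---") and anything that is not a table row.
        if not line.startswith("|") or line.startswith("|-"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells          # the first table row holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


sample = """\
|---+------+-------+-----------+--------+---|
| - | JobRef | JobId | StartTime | Status | % |
|---+------+-------+-----------+--------+---|
| 1 | One_Compute-LiquidPooltest-1 | c3c134b3-43f1-4fbd-96dd-ca58e756fb2a | 4/15/2021 8:36:02 AM | Aborted | 100 |
| 3 | Testing-LiquidPool-1 | 9ecb7393-d2c2-4ee0-a9aa-ec72c011d1fc | 4/15/2021 12:55:20 PM | Completed | 100 |
|---+------+-------+-----------+--------+---|
"""
jobs = parse_jobs_table(sample)
completed = [job["JobRef"] for job in jobs if job["Status"] == "Completed"]
print(completed)  # ['Testing-LiquidPool-1']
```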

Download the results of a Job

Try Get-Results. The results are downloaded from BLOB storage to the local Results folder. They also remain stored in BLOB storage.
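According to the help listing, Get-Results uses the default filter "*.r3d *.kfx.gz". To get a feeling for which files such a filter selects, the sketch below applies it to a few file names. It assumes that the filter is a space-separated list of shell-style wildcard patterns; how oc interprets the filter internally is not stated here, and the helper and the file names are hypothetical.

```python
from fnmatch import fnmatchcase

DEFAULT_FILTER = "*.r3d *.kfx.gz"  # default filter quoted in the help listing


def matches_filter(file_name, filter_spec=DEFAULT_FILTER):
    """Return True if file_name matches any of the patterns in filter_spec."""
    return any(fnmatchcase(file_name, pattern) for pattern in filter_spec.split())


for name in ("pool_0004.r3d", "pool.kfx.gz", "pool.log"):
    print(name, matches_filter(name))
```

With the default filter the first two names are selected and the log file is not.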

Upload or Download one file

Try Upload-File and Get-File.

Share my results with others

See Get-MyBlob and Get-AzCopy.

Set-ActiveJob

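Use Set-ActiveJob [JobRef], with a JobRef taken from the list of Jobs shown by the command Get-Jobs. Check that the JobRef in the command prompt has changed. The Get-FullHelp listing above also offers Set-JobRef <JobRef> for changing the current JobRef.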

Kill or Stop a running Job

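Use Cancel-Job to kill a Job; no results are stored in that case. Use Stop-Case if a controlled stop is wanted; the simulations are then stopped and their results are stored. The Get-FullHelp listing above also offers Stop-Job, which sends stop to all workers or to a specific one.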

Control active simulations

Use Send-Control to change control variables for a single case or a Job. Example: oc Send-Control [CaseName] Coumax=15

Author: Robert Olsen and Lars A. Lilleeng

Created: 2024-10-02 Wed 10:03