fovus job create

Upload files to Fovus and create a new job.

Creates a .fovus folder inside JOB_DIRECTORY. This folder contains data about the job and enables checking job status and downloading job files using the JOB_DIRECTORY.

JOB_CONFIG_FILE_PATH is the file path to a Fovus job config JSON. Values given in this file will be used unless they are overridden by CLI input. The job config JSON must follow the structure given in the provided job config templates, which are generated when the fovus config open command is used and will be located in ~/.fovus/job_configs.

JOB_DIRECTORY is the root directory of the job folder. A job folder contains one or multiple task folders. Each folder uploaded under the job folder will be considered a task of the job. Each task folder is a self-contained folder containing the necessary input files and scripts to run the task. For a job that has N tasks (e.g. a DOE job with N simulation tasks), N task folders must exist under the job folder and be uploaded.
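As a quick pre-upload check, the job folder layout described above can be inspected with a few lines of Python. This is a minimal sketch, not part of the Fovus CLI: the helper name list_task_folders is hypothetical, and it assumes that every immediate, non-hidden subfolder of the job folder counts as one task (so the .fovus metadata folder is skipped).

```python
import tempfile
from pathlib import Path


def list_task_folders(job_directory):
    """Return the names of the immediate subfolders of the job folder.

    Assumption: each immediate subfolder is one task, and hidden folders
    such as .fovus are metadata rather than tasks.
    """
    root = Path(job_directory)
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir() and not entry.name.startswith(".")
    )


# Build a throwaway job folder with three task folders to demonstrate.
with tempfile.TemporaryDirectory() as job_dir:
    for name in ("task001", "task002", "task003"):
        task = Path(job_dir) / name
        task.mkdir()
        (task / "run.sh").write_text("echo hello\n")
    (Path(job_dir) / ".fovus").mkdir()  # metadata folder, not a task
    tasks = list_task_folders(job_dir)
    print(len(tasks), "task folders:", tasks)
```

For a job with N tasks, the list should contain exactly N names before the job is created.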

fovus job create [OPTIONS] JOB_CONFIG_FILE_PATH JOB_DIRECTORY
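When the command is launched from a script, the usage line above maps naturally onto an argument list. The sketch below only assembles and prints the command; build_job_create_command, the file name job_config.json, and the folder name my_job are hypothetical, while --job-name and --walltime-hours are options documented on this page.

```python
import shlex


def build_job_create_command(job_config_file_path, job_directory, options=None):
    """Assemble the argument list for `fovus job create`.

    `options` is a list of (flag, value) pairs, so a flag may be repeated.
    """
    command = ["fovus", "job", "create"]
    for flag, value in options or []:
        command += [flag, str(value)]
    # The two required positional arguments come last, as in the usage line.
    command += [job_config_file_path, job_directory]
    return command


command = build_job_create_command(
    "job_config.json",  # hypothetical config file name
    "my_job",           # hypothetical job folder
    options=[("--job-name", "demo job"), ("--walltime-hours", 1.5)],
)
print(shlex.join(command))
```

To actually submit the job, the same list could be passed to subprocess.run(command, check=True) on a machine where the Fovus CLI is installed and configured.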

Options

--include-paths <include_paths>

The relative paths to files or folders inside the JOB_DIRECTORY that will be uploaded. Paths support Unix shell-style wildcards.

You can provide either --include-paths or --exclude-paths, not both.

Supported wildcards: * matches any number of characters; ? matches any single character.

E.g. taskName/out?/*.txt matches any .txt file in folders taskName/out1, taskName/out2, etc.

E.g. taskName???/folder/file.txt matches taskName001/folder/file.txt, taskName123/folder/file.txt, etc.

To specify multiple paths, provide this option multiple times or separate the paths with a comma (,). To escape a comma, use two commas (,,).

E.g. --include-paths "path1" --include-paths "path2"

E.g. --include-paths "path1,path2"

--exclude-paths <exclude_paths>

The relative paths to files or folders inside the JOB_DIRECTORY that will not be uploaded. Paths support Unix shell-style wildcards.

You can provide either --include-paths or --exclude-paths, not both.

Supported wildcards: * matches any number of characters; ? matches any single character.

E.g. taskName/out?/*.txt matches any .txt file in folders taskName/out1, taskName/out2, etc.

E.g. taskName???/folder/file.txt matches taskName001/folder/file.txt, taskName123/folder/file.txt, etc.

To specify multiple paths, provide this option multiple times or separate the paths with a comma (,). To escape a comma, use two commas (,,).

E.g. --exclude-paths "path1" --exclude-paths "path2"

E.g. --exclude-paths "path1,path2"
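The wildcard and comma rules for the include and exclude path options can be tried out locally before uploading. The sketch below is an approximation under stated assumptions, not the CLI's own implementation: it uses Python's fnmatch module for the * and ? wildcards, and split_paths is a hypothetical helper that treats a doubled comma as a literal comma. Note that in fnmatch, * also matches across folder separators, which may or may not be how the CLI behaves.

```python
from fnmatch import fnmatchcase


def split_paths(value):
    """Split on single commas; a doubled comma (,,) is a literal comma."""
    parts, current, i = [], "", 0
    while i < len(value):
        if value[i] == ",":
            if i + 1 < len(value) and value[i + 1] == ",":
                current += ","  # escaped comma stays in the path
                i += 2
                continue
            parts.append(current)  # single comma ends the current path
            current = ""
        else:
            current += value[i]
        i += 1
    parts.append(current)
    return parts


print(split_paths("path1,path2"))
print(split_paths("a,,b,c"))

# ? matches exactly one character, * matches any number of characters.
print(fnmatchcase("taskName/out1/log.txt", "taskName/out?/*.txt"))
print(fnmatchcase("taskName/out12/log.txt", "taskName/out?/*.txt"))
print(fnmatchcase("taskName001/folder/file.txt", "taskName???/folder/file.txt"))
```

The second pattern check fails because out12 has two characters after "out", while ? stands for exactly one.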

--project-name <project_name>: The project name associated with your job. If omitted, the default project will be used. Use ‘None’ to specify no project.

--email-notification, --no-email-notification: Enable or disable email notification when the job is finished. By default, your notification preference setting will be applied.

--debug-mode: Keep compute nodes alive after each task execution until the task walltimeHours is reached to allow additional time for interactive debugging via SSH.

--auto-delete-days <autoDeleteDays>: Set a number of days after which the job will be permanently deleted. The deletion timer starts immediately after the job completes, fails, or is terminated.

--benchmarking-profile-name <benchmarkingProfileName>: Fovus will optimize the cloud strategies for your job execution, including determining the optimal choices of virtual HPC infrastructure and computation parallelism, if applicable, based upon the selected benchmarking profile. For the best optimization results, select the benchmarking profile whose characteristics best resemble the workload under submission. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

--computing-device <computingDevice>

The target computing device(s) for running your workload. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

Options: cpu | cpu+gpu

--docker-hub-password <dockerHubPassword>: The password for the Docker Hub account that shall be used to pull the ‘imagePath’ for containerized environments. If this flag is provided, then --docker-hub-username must be provided as well.

--docker-hub-username <dockerHubUsername>: The username for the Docker Hub account that shall be used to pull the ‘imagePath’ for containerized environments. If this flag is provided, then --docker-hub-password must be provided as well.

--job-name <job_name>: The name of the job to be created. If a name is not provided, the job ID will be used.

--is-single-threaded-task <isSingleThreadedTask>: Set true if and only if each task uses at most a single CPU thread (vCPU). Setting it true allows multiple tasks to be deployed onto the same compute node to maximize the task-level parallelism and the utilization of all CPU threads (vCPUs).

--license-timeout-hours <licenseTimeoutHours>: For license-required jobs, the maximum time the job is allowed to wait in a queue for deployment when no license is available. A job will be terminated once the timeout timer expires. Not applicable to license-free jobs. Format: Real (e.g., 1.5). Range: ≥1. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

--monolithic-override <monolithicOverride>: Override a monolithic software environment. Provide VENDOR_NAME SOFTWARE_NAME LICENSE_FEATURE NEW_LICENSE_COUNT. All four values are required to reference a monolithic software and its license usage constraint. Currently, the only supported override for a monolithic software environment is the license count, and as a result, only the license count is overridden. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

--min-gpu <minGpu>: The minimum number of GPUs required to parallelize the execution of each task. Only values supported by the selected benchmarking profile are allowed. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

--max-gpu <maxGpu>: The maximum number of GPUs allowed to parallelize the execution of each task. Only values supported by the selected benchmarking profile are allowed. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

--min-gpu-mem-gib <minGpuMemGiB>: The minimum total size of GPU memory required to support the execution of each task, summing the required memory size for each GPU. Format: Real (e.g., 10.5). Only values supported by the selected benchmarking profile are allowed. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

--min-vcpu <minvCpu>: The minimum number of vCPUs required to parallelize the execution of each task (or multiple single-threaded tasks on each compute node if isSingleThreadedTask is true). A vCPU refers to a thread of a CPU core. Only values supported by the selected benchmarking profile are allowed. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

--job-max-cluster-size-vcpu <jobMaxClusterSizevCpu>: The maximum cluster size in terms of the total number of vCPUs allowed for parallelizing task runs in the job, which only takes effect when isSingleThreadedTask is true. A default value of 0 means no limit.

--max-vcpu <maxvCpu>: The maximum number of vCPUs allowed to parallelize the execution of each task (or multiple single-threaded tasks on each compute node if isSingleThreadedTask is true). A vCPU refers to a thread of a CPU core. Only values supported by the selected benchmarking profile are allowed. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

--min-vcpu-mem-gib <minvCpuMemGiB>: The minimum total size of system memory required to support the execution of each task (or multiple single-threaded tasks on each compute node if isSingleThreadedTask is true), summing the required memory size for each vCPU. Format: Real (e.g., 10.5). Only values supported by the selected benchmarking profile are allowed. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

--output-file-list <outputFileList>

Specify the output files to include or exclude from transferring back to the cloud storage using relative paths from the working directory of each task.

Supported wildcards: * matches any number of characters; ? matches any single character.

E.g. out?/*.txt matches any .txt file in folders out1, out2, etc.

E.g. folder???/file.txt matches folder001/file.txt, folder123/file.txt, etc.

This option may be provided multiple times. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

--output-file-option <outputFileOption>

Specify whether the output files in outputFileList should be included or excluded from transferring back to the cloud storage after the job is completed. See outputFileList for more information. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

Options: include | exclude

--remote-inputs <remoteInputsForAllTasks>

Provide the URL of or path to any file or folder in Fovus Storage. The files and folders specified will be included under the working directory of each task as inputs for all tasks and will be excluded from syncing back to Fovus Storage as job files. If the file or folder is from My Files of Fovus Storage, a short relative path works the same as the URL. The path to a folder must end with “/”. This option may be provided multiple times. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

E.g. “folderName/fileName.txt” is equivalent to “https://app.fovus.co/files?path=folderName/fileName.txt”

E.g. “folderName/” is equivalent to “https://app.fovus.co/folders?path=folderName”

--parallelism-config-files <parallelismConfigFiles>

Specify the configuration files that contain Fovus Environment Tokens, if any, using relative paths from the working directory of each task. All Fovus Environment Tokens in the configuration files specified will be resolved to values prior to task execution.

Supported wildcards: * matches any number of characters; ? matches any single character.

E.g. out?/*.txt matches any .txt file in folders out1, out2, etc.

E.g. folder???/file.txt matches folder001/file.txt, folder123/file.txt, etc.

This option may be provided multiple times. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

--parallelism-optimization <parallelismOptimization>: If enabled, Fovus will determine the optimal parallelism for parallelizing the computation of each task to minimize the total runtime and cost based on the time-to-cost priority ratio (TCPR) specified. To pass in the optimal parallelism to your software program, you can directly use the Fovus environment tokens, e.g., $FovusOptVcpu, $FovusOptGpu, in your command lines or in the input configuration files specified by the parallelismConfigFiles job config field. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

--run-command <runCommand>: Specify the command lines to launch each task. The same command lines will be executed under the working directory of each task. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

--scalable-parallelism <scalableParallelism>: A software program exhibits scalable parallelism if it can make use of more computing devices (e.g., vCPUs and/or GPUs) to parallelize a larger computation task. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

--scheduled-at <scheduledAt>: The time at which the job is scheduled to be submitted. The default time zone is your local time zone. Acceptable formats: 1) ISO 8601 format: “YYYY-MM-DDThh:mm:ss[.mmm]TZD” (e.g., “2020-01-01T18:30:00-05:00”). 2) Date only (defaults to the upcoming 12 AM in your local time zone): “YYYY-MM-DD” (e.g., “2020-01-01”). 3) Time only (defaults to the next occurrence of that time in your local time zone): “hh:mmTZD” or “hh:mm” or “hh:mm AM/PM/am/pm” (e.g., “18:30-05:00” or “18:30” or “6:30 PM”). 4) Natural language time: “DD month YYYY HH:MM AM/PM/am/pm timezone” (e.g., “21 July 2013 10:15 pm PDT”).
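A scheduled time in the ISO 8601 form with an explicit offset can be generated and sanity-checked with Python's standard datetime module. This is only an illustrative sketch: format_scheduled_at is a hypothetical helper, and how the CLI itself parses the value is not shown here.

```python
from datetime import datetime, timedelta, timezone


def format_scheduled_at(moment):
    """Render an aware datetime as YYYY-MM-DDThh:mm:ssTZD."""
    if moment.tzinfo is None:
        raise ValueError("attach a time zone so the offset (TZD) is explicit")
    return moment.isoformat(timespec="seconds")


# Parse the example from the option description and inspect its offset.
scheduled = datetime.fromisoformat("2020-01-01T18:30:00-05:00")
print(scheduled.utcoffset())
print(scheduled.astimezone(timezone.utc).isoformat())

# Build the same string from its components.
eastern = timezone(timedelta(hours=-5))
print(format_scheduled_at(datetime(2020, 1, 1, 18, 30, tzinfo=eastern)))

# A value one hour from now, carrying the local UTC offset.
in_one_hour = format_scheduled_at(datetime.now().astimezone() + timedelta(hours=1))
print(in_one_hour)
```

Writing the offset explicitly avoids any ambiguity about which time zone the submitting machine considers local.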

--storage-gib <storageGiB>: The total size of local SSD storage required to support the execution of each task (or multiple single-threaded tasks on each compute node if isSingleThreadedTask is true). No need to include any storage space for the operating system. This is only for task storage. Format: Integer. Range: [1, 65536]. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

--is-hybrid-strategy-allowed <isHybridStrategyAllowed>: This option further enhances infrastructure scalability beyond multi-cloud-zone/region auto-scaling. If true, more HPC strategies than the most optimal one will be allowed in infrastructure provisioning to maximize tasks running in parallel according to resource availability. In case the optimal HPC strategy has insufficient resource availability across the applicable cloud zones and regions at the time, the 2nd optimal strategy will be added to availability searching and infrastructure provisioning, then the 3rd, and so on, until all allowed tasks are running in parallel. The allowed HPC strategies are subject to default marginal time and cost constraints with respect to the optimal one to limit the potential time and cost inflation. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

--allow-preemptible <allowPreemptible>: Preemptible resources are subject to reclaim by cloud service providers, resulting in the possibility of interruption to running tasks. Interrupted tasks will be automatically re-queued and retried until completion. When enabled, cloud strategy optimization will take into account both on-demand and preemptible-based HPC strategies, using statistical models to analyze expected runtime and costs considering interruption probability, benchmarking performance, and pricing dynamics in real time. Preemptible resources will be prioritized if they are deemed optimal for your workload. When “Allow hybrid strategy” is also enabled, both preemptible and on-demand-based HPC strategies may be leveraged according to their optimality and resource availability. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

--is-resumable-workload <isResumableWorkload>: Indicates if the workload can save work in progress and resume from a saved session or checkpoint upon re-execution. This information allows cloud strategy optimization to better estimate runtime and costs when using preemptible resources, minimizing the impact of interruptions. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

--is-subject-to-license-availability <isSubjectToLicenseAvailability>: When enabled, Fovus will attempt to auto-switch the base license feature in case the specified one is unavailable or auto-adjust the maximum vCPU constraint in case the HPC license is insufficient to avoid license wait time. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

--enable-hyperthreading <enableHyperthreading>: For the CPUs that support hyperthreading, enabling hyperthreading allows two threads (vCPUs) to run concurrently on a single CPU core. For HPC workloads, disabling hyperthreading may result in performance benefits with respect to the same CPU cores (e.g., 32 threads on 32 cores with hyperthreading disabled vs. 64 threads on 32 cores with hyperthreading enabled), whereas enabling hyperthreading may result in cost benefits with respect to the same parallelism (e.g., 64 threads on 32 cores with hyperthreading enabled vs. 64 threads on 64 cores with hyperthreading disabled). (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)
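The thread and core counts quoted for hyperthreading follow from simple arithmetic, sketched below. The function names are hypothetical, and the sketch assumes exactly two threads per core when hyperthreading is enabled and one when it is disabled, as the description states.

```python
def vcpus_for(cores, hyperthreading_enabled):
    """Threads (vCPUs) available on the given number of physical cores."""
    threads_per_core = 2 if hyperthreading_enabled else 1
    return cores * threads_per_core


def cores_for(vcpus, hyperthreading_enabled):
    """Physical cores needed to provide the given number of vCPUs."""
    threads_per_core = 2 if hyperthreading_enabled else 1
    return -(-vcpus // threads_per_core)  # ceiling division


# Same cores, different thread counts.
print(vcpus_for(32, hyperthreading_enabled=False))
print(vcpus_for(32, hyperthreading_enabled=True))

# Same parallelism, different core counts.
print(cores_for(64, hyperthreading_enabled=True))
print(cores_for(64, hyperthreading_enabled=False))
```

The first pair corresponds to the performance comparison (32 versus 64 threads on 32 cores), and the second pair to the cost comparison (64 threads on 32 versus 64 cores).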

--supported-cpu-architectures <supportedCpuArchitectures>

The CPU architecture(s) compatible with your workload. Running your workload on an incompatible CPU architecture may result in a failed job. This option may be provided multiple times. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

Options: x86-64 | arm-64

--time-to-cost-priority-ration <timeToCostPriorityRatio>: Fovus will optimize the cloud strategies for your job execution to minimize the total runtime and cost based on the time-to-cost priority ratio (TCPR) specified. TCPR defines the weights (amount of importance) to be placed on time minimization over cost minimization on a relative scale. In particular, a ratio of 1/0 or 0/1 will force cloud strategies to pursue the minimum achievable runtime or cost without consideration of cost or runtime, respectively. Format must be “num1/num2” where num1 + num2 = 1, 0 <= num1 <= 1, and 0 <= num2 <= 1. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)
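The format constraint on the TCPR value can be checked before submitting a job. The following is a minimal sketch of such a check; parse_tcpr is a hypothetical helper, not a function of the Fovus CLI, and the small tolerance on the sum is an assumption made to absorb floating-point rounding.

```python
import math


def parse_tcpr(value):
    """Parse "num1/num2" and check num1 + num2 = 1 with both in [0, 1]."""
    num1_text, separator, num2_text = value.partition("/")
    if not separator:
        raise ValueError("expected the form num1/num2")
    num1, num2 = float(num1_text), float(num2_text)
    if not (0 <= num1 <= 1 and 0 <= num2 <= 1):
        raise ValueError("both numbers must lie between 0 and 1")
    # Compare with a tolerance because, e.g., 0.7 + 0.3 is not exactly 1.0.
    if not math.isclose(num1 + num2, 1.0, abs_tol=1e-9):
        raise ValueError("num1 + num2 must equal 1")
    return num1, num2


print(parse_tcpr("0.7/0.3"))  # weight on time, weight on cost
print(parse_tcpr("1/0"))      # minimum runtime regardless of cost
print(parse_tcpr("0/1"))      # minimum cost regardless of runtime
```

A value such as "0.6/0.6" is rejected because the two weights do not sum to 1.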

--license-consumption-profile <licenseConsumptionProfile>: licenseConsumptionProfile defines the pattern of license draw based on running conditions, such as vCPU or GPU parallelism. When a license consumption profile (LCP) is specified, the license consumption constraints for queue and auto-scaling will be automatically extracted based on the running conditions defined by the optimal cloud strategy. licenseConsumptionProfile has a higher precedence than licenseCountPerTask. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

--walltime-hours <walltimeHours>: The maximum time each task is allowed to run. A task will be terminated unconditionally once the walltime timer expires. Format: Real (e.g., 1.5). Range: >0. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

--post-processing-walltime-hours <postProcessingWalltimeHours>: The maximum time the post-processing task is allowed to run. The task will be terminated unconditionally once the walltime timer expires. Format: Real (e.g., 1.5). Range: >0. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

--post-processing-storage-gib <postProcessingStorageGiB>: The total size of local SSD storage required to support the execution of the post-processing task. No need to include any storage space for the operating system. This is only for task storage. Format: Integer. Range: [1, 65536]. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

--post-processing-run-command <postProcessingRunCommand>: Specify the command lines to launch the post-processing task. The same command lines will be executed under the working directory of the post-processing task. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

--post-processing-task-name <postProcessingTaskName>: Specify the folder name of the post-processing task. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

--search-output-keywords <keywords>: Specify the keywords to be used during the keyword search at the end of each task. ‘AND’ logic is applied when multiple keywords are provided. This option may be provided multiple times. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

--search-output-files <targetOutputFiles>

The desired output files to be used during the keyword search.

Supported wildcards: * matches any number of characters; ? matches any single character.

E.g. out?/*.txt matches any .txt file in folders out1, out2, etc.

E.g. folder???/file.txt matches folder001/file.txt, folder123/file.txt, etc.

This option may be provided multiple times. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

--is-retry-if-matched-enabled <isRetryIfMatchedEnabled>: If set to true, the task will be requeued for re-execution when the specified keywords are matched. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

--max-retry-attempts <maxRetryAttempts>: Specifies the maximum retry limit. Once this limit is reached, the task will no longer be retried. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

--job-id <job_id>: Specifies the pre-generated job ID. (Note: Used for overriding job config values. If this value is provided in the job config file, this argument is optional.)

Arguments

JOB_CONFIG_FILE_PATH: Required argument

JOB_DIRECTORY: Required argument