Command: run

Usage

guild run [OPTIONS] [[MODEL:]OPERATION] [FLAG=VAL...]

Run an operation.

By default Guild tries to run OPERATION for the default model defined in the current project.

If MODEL is specified, Guild uses it instead of the default model.

OPERATION may alternatively be a Python script. In this case any current project is ignored and the script is run directly. Options in the format --NAME=VAL can be passed to the script using flags (see below).
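
As an illustration of the script case, here is a minimal sketch of a script that accepts options in the --NAME=VAL format. The file name train.py and the flag names learning-rate and batch-size are hypothetical, chosen to match the examples later in this page. They are not defined by Guild.

```python
# train.py (hypothetical script name used only for illustration)
import argparse


def parse_flags(argv=None):
    """Parse --NAME=VAL style options into a namespace."""
    parser = argparse.ArgumentParser()
    # Hypothetical flags; the defaults are arbitrary placeholder values.
    parser.add_argument("--learning-rate", type=float, default=0.01)
    parser.add_argument("--batch-size", type=int, default=10)
    # parse_known_args ignores arguments that the script does not define.
    args, _unknown = parser.parse_known_args(argv)
    return args


if __name__ == "__main__":
    args = parse_flags()
    print(f"learning-rate={args.learning_rate} batch-size={args.batch_size}")
```

Under these assumptions, a command such as guild run train.py learning-rate=0.1 batch-size=100 would be expected to reach the script as the options --learning-rate and --batch-size.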

[MODEL:]OPERATION may be omitted if --restart or --proto is specified, in which case the operation from RUN is used.

Specify FLAG values in the form FLAG=VAL.

Batch Files

One or more batch files can be used to run multiple trials by specifying the file path as @PATH.

For example, to run trials specified in a CSV file named trials.csv, run:

guild run [MODEL:]OPERATION @trials.csv

NOTE: At this time you must specify the operation when using batch files. Batch files contain only flag values and cannot be used to run different operations in the same command.

Batch files may be formatted as CSV, JSON, or YAML. Format is determined by the file extension.

Each entry in the file is used as a set of flags for a trial run.

CSV files must have a header row containing the flag names. Each subsequent row is a corresponding list of flag values that Guild uses for a generated trial.

JSON and YAML files must contain a top-level list of flag-to-value maps.

Use --print-trials to preview the trials run for the specified batch files.
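
The batch file layouts described above can be sketched with a short Python script that writes the same two trials as CSV (header row of flag names, one row per trial) and as JSON (top-level list of flag-to-value maps). The flag names, values, and the temporary directory are assumptions made for illustration only.

```python
import csv
import json
import os
import tempfile

# Hypothetical trials: each entry is one set of flag values.
trials = [
    {"learning-rate": 0.01, "batch-size": 10},
    {"learning-rate": 0.1, "batch-size": 100},
]


def write_csv_batch(path, trials):
    """Write a header row of flag names followed by one row per trial."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(trials[0]))
        writer.writeheader()
        writer.writerows(trials)


def write_json_batch(path, trials):
    """Write a top-level list of flag-to-value maps."""
    with open(path, "w") as f:
        json.dump(trials, f, indent=2)


# Write both files to a temporary directory to keep the sketch self-contained.
batch_dir = tempfile.mkdtemp()
csv_path = os.path.join(batch_dir, "trials.csv")
json_path = os.path.join(batch_dir, "trials.json")
write_csv_batch(csv_path, trials)
write_json_batch(json_path, trials)
print(csv_path)
print(json_path)
```

Either file could then be passed to the command as @PATH, for example @trials.csv.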

Flag Lists

A list of flag values may be specified using the syntax [VAL1[,VAL2]...]. Lists containing white space must be quoted. When a list of values is provided, Guild generates a trial run for each value. When multiple flags have list values, Guild generates the cartesian product of all possible flag combinations.

Flag lists may be used to perform grid search operations.

For example, the following generates four runs for operation train and flags learning-rate and batch-size:

guild run train learning-rate=[0.01,0.1] batch-size=[10,100]
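
To see why two flags with two values each produce four runs, the cartesian product can be sketched in Python. This is only an illustration of the combinatorics, not Guild's implementation, and the order in which Guild generates trials may differ.

```python
from itertools import product

# Flag lists from the example above.
flag_lists = {
    "learning-rate": [0.01, 0.1],
    "batch-size": [10, 100],
}


def expand_trials(flag_lists):
    """Return one flag-to-value map per combination of list values."""
    names = list(flag_lists)
    value_lists = [flag_lists[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


trials = expand_trials(flag_lists)
for trial in trials:
    print(trial)
```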

You can preview the trials generated from flag lists using --print-trials. You can save the generated trials to a batch file using --save-trials. For more information, see Preview or Save Trials below.

When --optimizer is specified, flag lists may take on a different meaning depending on the type of optimizer. For example, the random optimizer randomly selects values from a flag list rather than generating a trial for each value. See Optimizers below for more information.

Optimizers

A run may be optimized using --optimizer. An optimizer runs up to --max-trials runs using flag values and flag configuration.

For details on available optimizers and their behavior, refer to the Optimizers Reference in the Guild AI documentation.

Limit Trials

When using flag lists or optimizers, which generate trials, you can limit the number of trials with --max-trials. By default, Guild limits the number of generated trials to 20.

Guild limits trials by randomly sampling the maximum number from the total list of generated trials. You can specify the seed used for the random sample with --random-seed. The random seed is guaranteed to generate consistent results when used on the same version of Python. When used across different versions of Python, the results may be inconsistent.
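
The idea of seeded sampling can be sketched as follows. This is an assumption-laden illustration, not Guild's actual sampling code: the function name limit_trials is hypothetical, and only the default of 20 comes from the text above.

```python
import random


def limit_trials(trials, max_trials=20, random_seed=None):
    """Return at most max_trials trials, sampled without replacement."""
    if len(trials) <= max_trials:
        return list(trials)
    # A dedicated generator keeps the sample reproducible for a given seed.
    rng = random.Random(random_seed)
    return rng.sample(trials, max_trials)


# 100 hypothetical trials, reduced to 20 using a fixed seed.
all_trials = [{"x": i} for i in range(100)]
first = limit_trials(all_trials, max_trials=20, random_seed=42)
second = limit_trials(all_trials, max_trials=20, random_seed=42)
print(len(first), first == second)
```

Repeating the call with the same seed on the same Python version selects the same trials, which is the property --random-seed is described as providing.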

Preview or Save Trials

When flag lists (used for grid search) or an optimizer is used, you can preview the generated trials using --print-trials. You can save the generated trials as a CSV batch file using --save-trials.

Start an Operation Using a Prototype Run

If --proto is specified, Guild applies the operation, flags, and source code used in RUN to the new operation. You may add or redefine flags in the new operation. You may use an alternative operation, in which case only the flag values and source code from RUN are applied. RUN must be a run ID or unique run ID prefix.

Restart an Operation

If --restart is specified, RUN is restarted using its operation and flags. Unlike --proto, restart does not create a new run. You cannot change the operation, flags, source code, or run directory when restarting a run.

Staging an Operation

Use --stage to stage an operation to be run later. Use --start with the staged run ID to start it.

If --start is specified, RUN is started using the same rules applied to --restart (see above).

Alternate Run Directory

To run an operation outside of Guild’s run management facility, use --run-dir or --stage-dir to specify an alternative run directory. These options are useful when developing or debugging an operation. Use --stage-dir to prepare a run directory for an operation without running the operation itself. This is useful when you want to verify run directory layout or manually run an operation in a prepared directory.

NOTE: Runs started with --run-dir are not visible to Guild and do not appear in run listings.

Control Visible GPUs

By default, operations have access to all available GPU devices. To limit the GPU devices available to a run, use --gpus.

For example, to limit visible GPU devices to 0 and 1, run:

guild run --gpus 0,1 ...

To disable all available GPUs, use --no-gpus.

NOTE: --gpus and --no-gpus are used to construct the CUDA_VISIBLE_DEVICES environment variable used for the run process. If CUDA_VISIBLE_DEVICES is set, using either of these options redefines that environment variable for the run.
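
A script can check which devices its process was given by reading CUDA_VISIBLE_DEVICES. The helper below is a hypothetical sketch. It treats an empty value as "no devices visible", which is an assumption about how --no-gpus is encoded rather than something stated in this help text.

```python
import os


def visible_gpu_ids(environ=None):
    """Return the device IDs listed in CUDA_VISIBLE_DEVICES.

    Returns None when the variable is not set (no restriction applied)
    and an empty list when it is set but lists no devices.
    """
    environ = os.environ if environ is None else environ
    value = environ.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return None
    return [part.strip() for part in value.split(",") if part.strip()]


if __name__ == "__main__":
    print(visible_gpu_ids())
```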

Optimize Runs

Use --optimizer to run the operation multiple times in an attempt to optimize a result. Use --minimize or --maximize to indicate what should be optimized. Use --max-trials to indicate the maximum number of runs the optimizer should generate.

Edit Flags

Use --edit-flags to use an editor to review and modify flag values. Guild uses the editor defined in VISUAL or EDITOR environment variables. If neither environment variable is defined, Guild uses an editor suitable for the current platform.

Debug Source Code

Use --debug-sourcecode to specify the location of project source code for debugging. Guild uses this path instead of the location of the copied source code for the run. For example, when debugging project files, use this option to ensure that modules are loaded from the project location rather than the run directory.

Breakpoints

Use --break to set breakpoints for Python-based operations. LOCATION may be specified as [FILENAME:]LINE or as MODULE.FUNCTION.

If FILENAME is not specified, the main module is assumed. Use the value 1 to break at the start of the main module (line 1).

Relative file names are resolved relative to their location in the Python system path. You can omit the .py extension.

If a line number does not correspond to a valid breakpoint, Guild attempts to set a breakpoint on the next valid breakpoint line in the applicable module.
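
The two LOCATION forms can be told apart with a small parser. The sketch below is hypothetical and is not Guild's own parsing code. It only illustrates how [FILENAME:]LINE and MODULE.FUNCTION differ syntactically.

```python
def parse_location(location):
    """Classify a breakpoint LOCATION string.

    Returns ("line", filename_or_None, line_number) for [FILENAME:]LINE
    and ("function", module, function) for MODULE.FUNCTION.
    """
    filename, _, last = location.rpartition(":")
    if last.isdigit():
        # No FILENAME part means the main module is assumed.
        return ("line", filename or None, int(last))
    module, _, function = location.rpartition(".")
    if not module or not function:
        raise ValueError(f"unrecognized breakpoint location: {location!r}")
    return ("function", module, function)


for example in ["1", "train:10", "train.main"]:
    print(example, parse_location(example))
```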

Options

-l, --label LABEL Set a label for the run.
-t, --tag TAG Associate TAG with run. May be used multiple times.
-c, --comment COMMENT Comment associated with the run.
-ec, --edit-comment Use an editor to type a comment.
-e, --edit-flags Use an editor to review and modify flags.
-d, --run-dir DIR Use alternative run directory DIR. Cannot be used with --stage.
--stage Stage an operation.
--start, --restart RUN Start a staged run or restart an existing run. Cannot be used with --proto or --run-dir.
--proto RUN Use the operation, flags and source code from RUN. Flags may be added or redefined in this operation. Cannot be used with --restart.
--force-sourcecode Use working source code when --restart or --proto is specified. Ignored otherwise.
--gpus DEVICES Limit available GPUs to DEVICES, a comma separated list of device IDs. By default all GPUs are available. Cannot be used with --no-gpus.
--no-gpus Disable GPUs for run. Cannot be used with --gpus.
-bl, --batch-label LABEL Label to use for batch runs. Ignored for non-batch runs.
-bt, --batch-tag TAG Associate TAG with batch. Ignored for non-batch runs. May be used multiple times.
-bc, --batch-comment COMMENT Comment associated with batch.
-ebc, --edit-batch-comment Use an editor to type a batch comment.
-o, --optimizer ALGORITHM Optimize the run using the specified algorithm. See Optimize Runs above for more information.
-O, --optimize Optimize the run using the default optimizer.
-N, --minimize COLUMN Column to minimize when running with an optimizer. See help for the compare command for details on specifying a column. May not be used with --maximize.
-X, --maximize COLUMN Column to maximize when running with an optimizer. See help for the compare command for details on specifying a column. May not be used with --minimize.
-Fo, --opt-flag FLAG=VAL Flag for OPTIMIZER. May be used multiple times.
-m, --max-trials, --trials N Maximum number of trials to run in batch operations. Default is optimizer specific. If optimizer is not specified, default is 20.
--random-seed N Random seed used when sampling trials or flag values.
--debug-sourcecode PATH Specify an alternative source code path for debugging. See Debug Source Code above for details.
--stage-trials For batch operations, stage trials without running them.
-r, --remote REMOTE Run the operation remotely.
-y, --yes Do not prompt before running operation.
-f, --force-flags Accept all flag assignments, even for undefined or invalid values.
--force-deps Continue even when a required resource is not resolved.
--stop-after N Stop operation after N minutes.
--fail-on-trial-error Stop batch operations when a trial exits with an error.
--needed Run only if there is not an available matching run. A matching run is of the same operation with the same flag values that is not stopped due to an error.
-b, --background Run operation in background.
--pidfile PIDFILE Run operation in background, writing the background process ID to PIDFILE.
-n, --no-wait Don’t wait for a remote operation to complete. Ignored if run is local.
--save-trials PATH Save generated trials to a CSV batch file. See Batch Files above for more information.
--keep-run Keep run even when configured with ‘delete-on-success’.
--keep-batch Keep batch run rather than delete it on success.
-D, --dep PATH Include PATH as a dependency.
--break LOCATION Set a breakpoint at the specified location for Python based operations. Set LOCATION to 1 to break at line 1 of the main module. See Breakpoints above for LOCATION format. Use multiple times for more than one breakpoint.
--break-on-error Enter the Python debugger at the point an error occurs for Python based operations.
-q, --quiet Do not show output.
--print-cmd Show operation command and exit.
--print-env Show operation environment and exit.
--print-trials Show generated trials and exit.
--help-model Show model help and exit.
-h, --help-op Show operation help and exit.
--test-output-scalars OUTPUT Test output scalars on OUTPUT. Use ‘-’ to read from standard input.
--test-sourcecode Test source code selection.
--test-flags Test flag configuration.
--help Show this message and exit.

Guild AI version 0.9.0
