Experiment Config Reference¶

Notes¶

This document lists field names in camelCase. They need to be converted to snake_case for the Python library nni.experiment.
In this document, field types are written as Python type hints. Therefore JSON objects are called dict and arrays are called list.
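
The naming rule above can be sketched as a small helper. This is only an illustration of the camelCase to snake_case mapping; the function name camel_to_snake is hypothetical and not part of NNI.

```python
import re


def camel_to_snake(name: str) -> str:
    """Convert a camelCase field name to snake_case (illustrative helper)."""
    # Insert "_" before every uppercase letter that is not the first character,
    # then lowercase the whole name.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


print(camel_to_snake("trialGpuNumber"))
print(camel_to_snake("maxExperimentDuration"))
```

Only top-level field names are meant to be converted this way. Keys inside searchSpace or classArgs belong to the user or the algorithm and should be left as written.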

Some fields take a path to a file or directory. Unless otherwise noted, both absolute and relative paths are supported, and ~ is expanded to the home directory.
- When written in a YAML file, relative paths are relative to the directory containing that file.
- When assigned in Python code, relative paths are relative to the current working directory.
- All relative paths are converted to absolute paths when loading a YAML file into a Python class, and when saving a Python class to a YAML file.
Setting a field to None or null is equivalent to not setting the field.
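
The path rules above can be approximated with the standard library. This is a minimal sketch, not NNI's actual implementation; resolve_path and its base_dir parameter are hypothetical names. Pass the directory of the YAML file as base_dir for the YAML case, or the current working directory for the Python case.

```python
import os
from typing import Optional


def resolve_path(value: Optional[str], base_dir: str) -> Optional[str]:
    """Expand ~ and make a relative path absolute against base_dir."""
    if value is None:
        # None means "field not set", so there is nothing to resolve.
        return None
    path = os.path.expanduser(value)
    if not os.path.isabs(path):
        path = os.path.join(base_dir, path)
    # normpath removes "." and ".." segments without touching the file system.
    return os.path.normpath(path)


print(resolve_path("search_space.json", os.getcwd()))
```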

Examples¶

Local Mode¶

experimentName: MNIST
searchSpaceFile: search_space.json
trialCommand: python mnist.py
trialCodeDirectory: .
trialGpuNumber: 1
maxExperimentDuration: 24h
maxTrialNumber: 100
tuner:
  name: TPE
  classArgs:
    optimize_mode: maximize
trainingService:
  platform: local
  useActiveGpu: True

Local Mode (Inline Search Space)¶

searchSpace:
  batch_size:
    _type: choice
    _value: [16, 32, 64]
  learning_rate:
    _type: loguniform
    _value: [0.0001, 0.1]
trialCommand: python mnist.py
trialGpuNumber: 1
tuner:
  name: TPE
  classArgs:
    optimize_mode: maximize
trainingService:
  platform: local
  useActiveGpu: True

Remote Mode¶

experimentName: MNIST
searchSpaceFile: search_space.json
trialCommand: python mnist.py
trialCodeDirectory: .
trialGpuNumber: 1
maxExperimentDuration: 24h
maxTrialNumber: 100
tuner:
  name: TPE
  classArgs:
    optimize_mode: maximize
trainingService:
  platform: remote
  machineList:
    - host: 11.22.33.44
      user: alice
      password: xxxxx
    - host: my.domain.com
      user: bob
      sshKeyFile: ~/.ssh/id_rsa

Reference¶

ExperimentConfig¶

experimentName¶

Mnemonic name of the experiment. This will be shown in the web UI and nnictl.

type: Optional[str]

searchSpaceFile¶

Path to a JSON file containing the search space.

type: Optional[str]

The search space format is determined by the tuner. The common format for built-in tuners is documented here.

Mutually exclusive with searchSpace.

searchSpace¶

Search space object.

type: Optional[JSON]

The format is determined by the tuner. The common format for built-in tuners is documented here.

Note that None means “no such field”, so an empty search space should be written as {}.

Mutually exclusive with searchSpaceFile.
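
The interaction between searchSpace and searchSpaceFile can be sketched as a small check. The function below is hypothetical and only illustrates the two rules stated above: the fields are mutually exclusive, and {} counts as a set (empty) search space while None does not.

```python
from typing import Any, Optional


def search_space_source(search_space: Optional[Any] = None,
                        search_space_file: Optional[str] = None) -> str:
    """Return which field supplies the search space: "inline", "file", or "none"."""
    # Compare against None explicitly: an empty dict {} is a valid, empty search space.
    has_inline = search_space is not None
    has_file = search_space_file is not None
    if has_inline and has_file:
        raise ValueError("searchSpace and searchSpaceFile are mutually exclusive")
    if has_inline:
        return "inline"
    if has_file:
        return "file"
    return "none"


print(search_space_source({}))
print(search_space_source(None, "search_space.json"))
```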

trialCommand¶

Command to launch trial.

type: str

The command will be executed in bash on Linux and macOS, and in PowerShell on Windows.

trialCodeDirectory¶

Path to the directory containing trial source files.

type: str

default: "."

All files in this directory will be sent to the training machine, unless there is a .nniignore file. (See the nniignore section of the quick start guide for details.)

trialConcurrency¶

Specify how many trials should be run concurrently.

type: int

The real concurrency also depends on hardware resources and may be less than this value.

trialGpuNumber¶

Number of GPUs used by each trial.

type: Optional[int]

This field might have a slightly different meaning for different training services, especially when set to 0 or None. See the training service’s documentation for details.

In local mode, setting the field to zero prevents trials from accessing GPUs (by setting CUDA_VISIBLE_DEVICES to an empty value). When set to None, trials are created and scheduled as if they did not use GPUs, but they can still use all GPU resources if they want.
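
The local-mode behaviour described above can be pictured as follows. This is a rough sketch under stated assumptions, not NNI's scheduler: local_trial_env is a hypothetical function, and assigned_gpus stands in for whatever GPU indices a scheduler would pick for the trial.

```python
from typing import Dict, Optional, Sequence


def local_trial_env(trial_gpu_number: Optional[int],
                    base_env: Dict[str, str],
                    assigned_gpus: Sequence[int] = ()) -> Dict[str, str]:
    """Build the environment of a trial process for a given trialGpuNumber."""
    env = dict(base_env)
    if trial_gpu_number is None:
        # Not set: the environment is left untouched, so the trial can see every GPU.
        return env
    if trial_gpu_number == 0:
        # Zero: an empty CUDA_VISIBLE_DEVICES hides all GPUs from the trial.
        env["CUDA_VISIBLE_DEVICES"] = ""
        return env
    # Positive: expose only the GPUs assigned to this trial (assumed scheduler output).
    visible = list(assigned_gpus)[:trial_gpu_number]
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in visible)
    return env


print(local_trial_env(0, {"PATH": "/usr/bin"}))
```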

maxExperimentDuration¶

Limit the duration of this experiment if specified.

type: Optional[str]

format: number + s|m|h|d

examples: "10m", "0.5h"

When time runs out, the experiment will stop creating trials but continue to serve the web UI.
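
The duration format (number + s|m|h|d) can be parsed with a short helper. This is a sketch of one way to read the format, assuming decimal numbers such as "0.5h" are allowed as in the examples; parse_duration is a hypothetical name.

```python
import re

# Seconds per unit suffix.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(text: str) -> float:
    """Convert a string such as "10m" or "0.5h" into seconds."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([smhd])", text.strip())
    if match is None:
        raise ValueError(f"invalid duration: {text!r}")
    number, unit = match.groups()
    return float(number) * _UNIT_SECONDS[unit]


print(parse_duration("10m"))   # 600.0
print(parse_duration("0.5h"))  # 1800.0
```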

maxTrialNumber¶

Limit the number of trials to create if specified.

type: Optional[int]

When the budget runs out, the experiment will stop creating trials but continue to serve the web UI.

nniManagerIp¶

IP of the current machine, used by training machines to access NNI manager. Not used in local mode.

type: Optional[str]

If not specified, the IPv4 address of eth0 will be used.

Must be set on Windows and on systems using predictable network interface names, except in local mode.

useAnnotation¶

Enable annotation.

type: bool

default: False

When using annotation, searchSpace and searchSpaceFile should not be specified manually.

debug¶

Enable debug mode.

type: bool

default: False

When enabled, logging will be more verbose and some internal validation will be loosened.

logLevel¶

Set the log level of the whole system.

type: Optional[str]

values: "trace", "debug", "info", "warning", "error", "fatal"

Defaults to “info” or “debug”, depending on the debug option.

Most modules of NNI will be affected by this value, including NNI manager, tuner, training service, etc.

The exception is trials, whose logging level is managed directly by trial code.

For Python modules, “trace” acts as logging level 0 and “fatal” acts as logging.CRITICAL.
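
The mapping to Python logging levels can be written out as a table. The sketch below follows the two endpoints given above ("trace" as level 0, "fatal" as logging.CRITICAL) and assumes the remaining names map to the standard levels of the same name. It also assumes the default is "debug" when the debug option is enabled and "info" otherwise. effective_log_level is a hypothetical helper.

```python
import logging
from typing import Optional

# "trace" and "fatal" follow the text above; the other entries are assumed
# to map to the standard logging levels of the same name.
LOG_LEVELS = {
    "trace": 0,
    "debug": logging.DEBUG,
    "info": logging.INFO,
    "warning": logging.WARNING,
    "error": logging.ERROR,
    "fatal": logging.CRITICAL,
}


def effective_log_level(log_level: Optional[str] = None, debug: bool = False) -> int:
    """Resolve logLevel to a numeric Python logging level."""
    if log_level is None:
        # Assumed default: "debug" in debug mode, "info" otherwise.
        log_level = "debug" if debug else "info"
    return LOG_LEVELS[log_level]


print(effective_log_level("fatal"))
```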

experimentWorkingDirectory¶

Specify the directory to place logs, checkpoints, metadata, and other run-time files.

type: Optional[str]

By default, ~/nni-experiments is used.

NNI will create a subdirectory named after the experiment ID, so it is safe to use the same directory for multiple experiments.
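
The resulting layout can be sketched with pathlib. This only illustrates the "one subdirectory per experiment ID" rule; experiment_directory is a hypothetical helper and the experiment IDs shown are made up.

```python
from pathlib import Path
from typing import Optional

DEFAULT_WORKING_DIRECTORY = "~/nni-experiments"


def experiment_directory(experiment_id: str,
                         working_directory: Optional[str] = None) -> Path:
    """Return the per-experiment subdirectory inside the working directory."""
    base = Path(working_directory if working_directory is not None
                else DEFAULT_WORKING_DIRECTORY)
    # Each experiment gets its own subdirectory, so experiments never collide.
    return base.expanduser() / experiment_id


print(experiment_directory("abc123", "/data/nni"))
print(experiment_directory("xyz789", "/data/nni"))
```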

tunerGpuIndices¶

Limit the GPUs visible to tuner, assessor, and advisor.

type: Optional[list[int] | str]

This will be the CUDA_VISIBLE_DEVICES environment variable of the tuner process.

Because tuner, assessor, and advisor run in the same process, this option affects all of them.
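
Since the field accepts either a list of integers or a string, the conversion to a CUDA_VISIBLE_DEVICES value can be sketched as below. The helper name is hypothetical, and passing a string through unchanged (apart from trimming whitespace) is an assumption.

```python
from typing import List, Optional, Union


def to_cuda_visible_devices(indices: Union[List[int], str, None]) -> Optional[str]:
    """Turn a GPU index list or string into a CUDA_VISIBLE_DEVICES value."""
    if indices is None:
        # Field not set: do not restrict the visible GPUs.
        return None
    if isinstance(indices, str):
        # Assumed: a string is already in "0,1" form.
        return indices.strip()
    return ",".join(str(i) for i in indices)


print(to_cuda_visible_devices([0, 1]))  # 0,1
```

The same conversion would apply to the gpuIndices fields of the training service configs, which accept the same type.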

tuner¶

Specify the tuner.

type: Optional[AlgorithmConfig]

assessor¶

Specify the assessor.

type: Optional[AlgorithmConfig]

advisor¶

Specify the advisor.

type: Optional[AlgorithmConfig]

trainingService¶

Specify training service.

type: TrainingServiceConfig

AlgorithmConfig¶

AlgorithmConfig describes a tuner / assessor / advisor algorithm.

For custom algorithms, there are two ways to describe them:

- Register the algorithm to use it like a built-in one. (preferred)
- Specify the code directory and class name directly.

name¶

Name of built-in or registered algorithm.

type: str for built-in and registered algorithms, None for other custom algorithms

className¶

Qualified class name of an unregistered custom algorithm.

type: None for built-in and registered algorithms, str for other custom algorithms

example: "my_tuner.MyTuner"

codeDirectory¶

Path to the directory containing the custom algorithm class.

type: None for built-in and registered algorithms, str for other custom algorithms

classArgs¶

Keyword arguments passed to the algorithm class’s constructor.

type: Optional[dict[str, Any]]

See the algorithm’s documentation for supported values.
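
The two ways of describing an algorithm imply a simple consistency rule: either name is set, or className and codeDirectory are set. A minimal sketch of that check follows. The function algorithm_kind is hypothetical, and "my_tuner_dir" is a made-up directory name.

```python
from typing import Any, Dict


def algorithm_kind(config: Dict[str, Any]) -> str:
    """Classify an AlgorithmConfig dict as "registered" or "custom"."""
    name = config.get("name")
    class_name = config.get("className")
    code_directory = config.get("codeDirectory")
    if name is not None:
        # Built-in or registered algorithm: className and codeDirectory must stay unset.
        if class_name is not None or code_directory is not None:
            raise ValueError("name cannot be combined with className or codeDirectory")
        return "registered"
    # Unregistered custom algorithm: both className and codeDirectory are needed.
    if class_name is None or code_directory is None:
        raise ValueError("custom algorithms need both className and codeDirectory")
    return "custom"


print(algorithm_kind({"name": "TPE", "classArgs": {"optimize_mode": "maximize"}}))
print(algorithm_kind({"className": "my_tuner.MyTuner", "codeDirectory": "my_tuner_dir"}))
```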

TrainingServiceConfig¶

One of the following:

LocalConfig
RemoteConfig
OpenpaiConfig
AmlConfig

For other training services, we suggest using the v1 config schema for now.

LocalConfig¶

Detailed here.

platform¶

Constant string "local".

useActiveGpu¶

Specify whether NNI should submit trials to GPUs occupied by other tasks.

type: Optional[bool]

Must be set when trialGpuNumber is greater than zero.

If you are using a desktop system with a GUI, set this to True.

maxTrialNumberPerGpu¶

Specify how many trials can share one GPU.

type: int

default: 1

gpuIndices¶

Limit the GPUs visible to trial processes.

type: Optional[list[int] | str]

If trialGpuNumber is less than the length of this value, only a subset will be visible to each trial.

This will be used as CUDA_VISIBLE_DEVICES environment variable.

RemoteConfig¶

Detailed here.

platform¶

Constant string "remote".

machineList¶

List of training machines.

type: list[RemoteMachineConfig]

reuseMode¶

Enable reuse mode.

type: bool

RemoteMachineConfig¶

host¶

IP or hostname (domain name) of the machine.

type: str

port¶

SSH service port.

type: int

default: 22

user¶

Login user name.

type: str

password¶

Login password.

type: Optional[str]

If not specified, sshKeyFile will be used instead.

sshKeyFile¶

Path to the SSH key file (identity file).

type: Optional[str]

Only used when password is not specified.

sshPassphrase¶

Passphrase of SSH identity file.

type: Optional[str]

useActiveGpu¶

Specify whether NNI should submit trials to GPUs occupied by other tasks.

type: bool

default: False

maxTrialNumberPerGpu¶

Specify how many trials can share one GPU.

type: int

default: 1

gpuIndices¶

Limit the GPUs visible to trial processes.

type: Optional[list[int] | str]

If trialGpuNumber is less than the length of this value, only a subset will be visible to each trial.

This will be used as CUDA_VISIBLE_DEVICES environment variable.

pythonPath¶

Specify a Python environment. This path will be inserted at the front of PATH. Here are some examples:

(linux) pythonPath: /opt/python3.7/bin
(windows) pythonPath: C:/Python37

Note: Anaconda environments differ slightly. On Linux, point to the environment’s bin directory. On Windows, list the environment directory together with its Scripts and Library/bin subdirectories, separated by “;”, as in the examples below:

(linux anaconda) pythonPath: /home/yourname/anaconda3/envs/myenv/bin/
(windows anaconda) pythonPath: C:/Users/yourname/.conda/envs/myenv;C:/Users/yourname/.conda/envs/myenv/Scripts;C:/Users/yourname/.conda/envs/myenv/Library/bin

type: Optional[str]

This is useful if preparing steps vary for different machines.
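
The effect of pythonPath on PATH can be sketched as a plain string operation. This is an illustration only; prepend_python_path is a hypothetical helper, and the separator is passed explicitly so the example behaves the same on every platform.

```python
import os
from typing import Optional


def prepend_python_path(python_path: Optional[str],
                        current_path: str,
                        sep: str = os.pathsep) -> str:
    """Return PATH with python_path inserted at the front."""
    if python_path is None:
        # Field not set: PATH stays as it is.
        return current_path
    if not current_path:
        return python_path
    return python_path + sep + current_path


print(prepend_python_path("/opt/python3.7/bin", "/usr/bin:/bin", sep=":"))
```

Because the new entry comes first, the python executable found in pythonPath takes precedence over any other interpreter already on PATH.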

OpenpaiConfig¶

Detailed here.

platform¶

Constant string "openpai".

host¶

Hostname of OpenPAI service.

type: str

This may include an https:// or http:// prefix.

HTTPS will be used by default.
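
The prefix rule can be sketched as a small normalisation step. normalize_openpai_host is a hypothetical helper, and the host name in the example is made up.

```python
def normalize_openpai_host(host: str) -> str:
    """Return host with an explicit scheme, defaulting to HTTPS."""
    if host.startswith(("http://", "https://")):
        # An explicit scheme is kept as written.
        return host
    return "https://" + host


print(normalize_openpai_host("pai.example.com"))
print(normalize_openpai_host("http://pai.example.com"))
```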

username¶

OpenPAI user name.

type: str

token¶

OpenPAI user token.

type: str

This can be found in your OpenPAI user settings page.

dockerImage¶

Name and tag of docker image to run the trials.

type: str

default: "msranni/nni:latest"

nniManagerStorageMountPoint¶

Mount point of storage service (typically NFS) on current machine.

type: str

containerStorageMountPoint¶

Mount point of storage service (typically NFS) in docker container.

type: str

This must be an absolute path.

reuseMode¶

Enable reuse mode.

type: bool

default: False

openpaiConfig¶

Embedded OpenPAI config file.

type: Optional[JSON]

openpaiConfigFile¶

Path to OpenPAI config file.

type: Optional[str]

An example can be found here.

AmlConfig¶

Detailed here.

platform¶

Constant string "aml".

dockerImage¶

Name and tag of docker image to run the trials.

type: str

default: "msranni/nni:latest"

subscriptionId¶

Azure subscription ID.

type: str

resourceGroup¶

Azure resource group name.

type: str

workspaceName¶

Azure workspace name.

type: str

computeTarget¶

AML compute cluster name.

type: str