How to Launch an Experiment from Python

Overview

Since NNI v2.0, there is a new way to launch experiments. Previously, you had to configure the experiment in a YAML configuration file and then launch it with the nnictl command. Now you can also configure and run experiments directly from a Python file, which is more convenient if you are familiar with Python programming.

Run a New Experiment

After installing NNI, you can start an experiment from a Python script in the following two steps.

Step 1 - Initialize an experiment instance and configure it

from nni.experiment import Experiment
experiment = Experiment('local')

Now you have an Experiment instance. This experiment will launch trials on your local machine because training_service='local'.

See all training services supported in NNI.

experiment.config.experiment_name = 'MNIST example'
experiment.config.trial_concurrency = 2
experiment.config.max_trial_number = 10
experiment.config.search_space = search_space
experiment.config.trial_command = 'python3 mnist.py'
experiment.config.trial_code_directory = Path(__file__).parent
experiment.config.tuner.name = 'TPE'
experiment.config.tuner.class_args['optimize_mode'] = 'maximize'
experiment.config.training_service.use_active_gpu = True

Use assignments of the form experiment.config.foo = 'bar' to configure your experiment.

See all built-in tuners supported in NNI.

See parameter configuration required by different training services.

Step 2 - Just run

experiment.run(port=8080)

You have now launched an NNI experiment. Open localhost:8080 in your browser to observe the experiment in real time.

Note

Launched this way, the experiment runs in the foreground and exits automatically when it finishes. To run an experiment interactively, use start() instead of run() in Step 2.

Example

Below is an example for this new launching approach. You can also find this code in mnist-tfv2/launch.py.

from pathlib import Path

from nni.experiment import Experiment

search_space = {
    "dropout_rate": { "_type": "uniform", "_value": [0.5, 0.9] },
    "conv_size": { "_type": "choice", "_value": [2, 3, 5, 7] },
    "hidden_size": { "_type": "choice", "_value": [124, 512, 1024] },
    "batch_size": { "_type": "choice", "_value": [16, 32] },
    "learning_rate": { "_type": "choice", "_value": [0.0001, 0.001, 0.01, 0.1] }
}

experiment = Experiment('local')
experiment.config.experiment_name = 'MNIST example'
experiment.config.trial_concurrency = 2
experiment.config.max_trial_number = 10
experiment.config.search_space = search_space
experiment.config.trial_command = 'python3 mnist.py'
experiment.config.trial_code_directory = Path(__file__).parent
experiment.config.tuner.name = 'TPE'
experiment.config.tuner.class_args['optimize_mode'] = 'maximize'
experiment.config.training_service.use_active_gpu = True

experiment.run(8080)
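Before launching, it can help to check that a trial script copes with every kind of value the search space can produce. The sketch below draws one random configuration from a search space dict of the same shape as the example. The function name sample_configuration is hypothetical and not part of NNI, only the "choice" and "uniform" types used above are handled, and this is plain random sampling, not what the TPE tuner does.

```python
import random


def sample_configuration(search_space, rng=None):
    """Draw one random configuration from an NNI-style search space dict.

    Hypothetical helper for local sanity checks; it is not part of NNI.
    Handles only "choice" (pick one listed value) and "uniform"
    (float between the two listed bounds).
    """
    rng = rng or random.Random()
    config = {}
    for name, spec in search_space.items():
        kind, values = spec["_type"], spec["_value"]
        if kind == "choice":
            config[name] = rng.choice(values)
        elif kind == "uniform":
            low, high = values
            config[name] = rng.uniform(low, high)
        else:
            raise ValueError(f"unsupported _type {kind!r} for {name!r}")
    return config


# A reduced copy of the search space from the example above.
search_space = {
    "dropout_rate": {"_type": "uniform", "_value": [0.5, 0.9]},
    "conv_size": {"_type": "choice", "_value": [2, 3, 5, 7]},
    "batch_size": {"_type": "choice", "_value": [16, 32]},
}

# A seeded generator keeps the draw reproducible between runs.
sample = sample_configuration(search_space, random.Random(0))
print(sample)
```

Feeding a few such samples to the trial code by hand is a cheap way to catch a misspelled parameter name before spending trial budget on it.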

Start and Manage a New Experiment

The API from NNI Client has been migrated to this new launching approach. Launch the experiment with start() instead of run(), and you can then use these APIs in interactive mode.

Please refer to example usage and code file python_api_start.ipynb.

Note

run() polls the experiment status and automatically calls stop() when the experiment finishes. start() only launches a new experiment, so you need to stop it manually by calling stop().
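Because start() leaves stopping to the caller, one way to avoid a forgotten stop() is to wrap the pair in a context manager. The following is a minimal sketch, not something NNI provides: the name started is hypothetical, and it relies only on the start(port) and stop() methods documented in the API section.

```python
from contextlib import contextmanager


@contextmanager
def started(experiment, port=8080):
    """Start `experiment` in the background and always stop it on exit.

    Hypothetical helper; `experiment` is expected to offer the documented
    start(port) and stop() methods.
    """
    experiment.start(port)
    try:
        yield experiment
    finally:
        # Runs on normal exit and when the body raises.
        experiment.stop()


# Usage sketch (requires NNI and a configured experiment):
#
#   from nni.experiment import Experiment
#   experiment = Experiment('local')
#   ...  # configure experiment.config as in Step 1
#   with started(experiment, 8080) as exp:
#       print(exp.get_status())
```

If start() itself raises, stop() is not called, which matches the documented behavior that start() only returns once the experiment has started.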

Connect and Manage an Existing Experiment

If you launched the experiment with nnictl and also want to use these APIs, use Experiment.connect() to connect to the existing experiment.

Please refer to example usage and code file python_api_connect.ipynb.

Note

You can use stop() to stop the experiment when connecting to an existing experiment.
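Once connected, the getters listed in the API section can be combined into a quick status report. The sketch below assumes only the documented get_status(), list_trial_jobs() and get_job_statistics() methods; the helper name summarize and the keys of the returned dict are illustrative choices, not part of NNI.

```python
def summarize(experiment):
    """Collect a few facts about a running experiment.

    Hypothetical helper built on the documented getters; the result keys
    are chosen here for illustration only.
    """
    return {
        "status": experiment.get_status(),
        "trial_count": len(experiment.list_trial_jobs()),
        "statistics": experiment.get_job_statistics(),
    }


# Usage sketch (requires NNI and an experiment already running on port 8080):
#
#   from nni.experiment import Experiment
#   experiment = Experiment.connect(8080)
#   print(summarize(experiment))
```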

API

class nni.experiment.Experiment(config: nni.experiment.config.common.ExperimentConfig)

class nni.experiment.Experiment(training_service: Union[str, List[str]])

Create and stop an NNI experiment.

config: Experiment configuration.

port: Web UI port of the experiment, or None if it is not running.

classmethod connect(port: int)

Connect to an existing experiment.

Parameters: port – The port of web UI.

export_data()

Return exported information for all trial jobs.

Returns: List of TrialResult.
Return type: list

get_all_experiments_metadata()

Return all experiments metadata as a list.

Returns: The experiments metadata.
Return type: list

get_experiment_metadata(exp_id: str)

Return experiment metadata with specified exp_id as a dict.

Returns: The specified experiment metadata.
Return type: dict

get_experiment_profile()

Return experiment profile as a dict.

Returns: The profile of the experiment.
Return type: dict

get_job_metrics(trial_job_id=None)

Return trial job metrics.

Parameters: trial_job_id (str) – Trial job id. If this parameter is None, the metrics of all trial jobs are returned.
Returns: Each key is a trialJobId, the corresponding value is a list of TrialMetricData.
Return type: dict
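Since the return value maps each trial job id to a list of metric records, a quick way to see which trials have reported anything is to count the entries per key. This sketch relies only on that dict-of-lists shape and makes no assumption about the fields of TrialMetricData; the name metric_counts is hypothetical.

```python
def metric_counts(job_metrics):
    """Map each trial job id to the number of metric records it has.

    Hypothetical helper; `job_metrics` is a dict of lists, as described
    for the return value of get_job_metrics().
    """
    return {trial_id: len(records) for trial_id, records in job_metrics.items()}


# Usage sketch (requires a started or connected experiment):
#
#   counts = metric_counts(experiment.get_job_metrics())
#   silent = [trial_id for trial_id, n in counts.items() if n == 0]
```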

get_job_statistics()

Return trial job statistics information as a dict.

Returns: Job statistics information.
Return type: dict

get_status() → str

Return experiment status as a str.

Returns: Experiment status.
Return type: str

get_trial_job(trial_job_id: str)

Return a trial job.

Parameters: trial_job_id (str) – Trial job id.
Returns: A TrialJob instance corresponding to trial_job_id.
Return type: TrialJob

list_trial_jobs()

Return information for all trial jobs as a list.

Returns: List of TrialJob.
Return type: list

run(port: int = 8080, debug: bool = False) → bool

Run the experiment.

This function blocks until the experiment finishes or fails.

Returns True when the experiment completes, or False when it fails.
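The blocking behavior described here can be pictured as a polling loop over get_status(). The following is a sketch of that idea, not NNI's actual implementation: the status strings 'DONE', 'STOPPED' and 'ERROR' are assumptions about what get_status() reports, and the helper name wait_for_result and its sleep parameter are hypothetical.

```python
import time

# Assumed status strings; check them against what get_status() returns.
FINISHED = {"DONE", "STOPPED"}
FAILED = {"ERROR"}


def wait_for_result(experiment, poll_seconds=10.0, sleep=time.sleep):
    """Poll get_status() until the experiment finishes or fails.

    Returns True on a finished status and False on a failed one.
    `sleep` is injectable so the loop can be exercised without waiting.
    """
    while True:
        status = experiment.get_status()
        if status in FINISHED:
            return True
        if status in FAILED:
            return False
        sleep(poll_seconds)


# Usage sketch (requires NNI and a configured experiment):
#
#   experiment.start(8080)
#   try:
#       succeeded = wait_for_result(experiment)
#   finally:
#       experiment.stop()
```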

start(port: int = 8080, debug: bool = False) → None

Start the experiment in the background.

This method raises an exception on failure. If it returns, the experiment has been started successfully.

Parameters

port – The port of web UI.
debug – Whether to start in debug mode.

stop() → None

Stop the background experiment.

update_max_experiment_duration(value: str)

Update an experiment’s max_experiment_duration.

Parameters: value (str) – Strings like ‘1m’ for one minute or ‘2h’ for two hours. SUFFIX may be ‘s’ for seconds, ‘m’ for minutes, ‘h’ for hours or ‘d’ for days.
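To make the suffix rule concrete, the sketch below converts such strings into seconds. It is an illustration of the format described above, not NNI's own parser: the name duration_to_seconds is hypothetical, and it assumes a whole number followed by exactly one suffix letter.

```python
# Seconds per suffix, following the list in the parameter description.
_SECONDS_PER_UNIT = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def duration_to_seconds(value):
    """Convert strings such as '1m' or '2h' into a number of seconds.

    Hypothetical helper; assumes a whole number followed by one of the
    suffixes 's', 'm', 'h' or 'd'.
    """
    value = value.strip()
    number, suffix = value[:-1], value[-1:]
    if suffix not in _SECONDS_PER_UNIT or not number.isdecimal():
        raise ValueError(f"not a duration string: {value!r}")
    return int(number) * _SECONDS_PER_UNIT[suffix]


print(duration_to_seconds("1m"))  # one minute
print(duration_to_seconds("2h"))  # two hours
```

Checking a duration string like this before passing it to update_max_experiment_duration() can catch a typo such as '2hr' early.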

update_max_trial_number(value: int)

Update an experiment’s max_trial_number.

Parameters: value (int) – New max_trial_number value.

update_search_space(value: dict)

Update the experiment’s search_space. TODO: support searchspace file.

Parameters: value (dict) – New search_space.

update_trial_concurrency(value: int)

Update an experiment’s trial_concurrency.

Parameters: value (int) – New trial_concurrency value.
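The four update_* methods can be driven from one place when several limits change together. The sketch below is one possible wrapper under the assumption that the experiment is already running; the name apply_updates and its keyword arguments are hypothetical, and it calls only the update methods documented above.

```python
def apply_updates(experiment, *, max_trial_number=None, trial_concurrency=None,
                  max_experiment_duration=None, search_space=None):
    """Call the matching update_* method for every argument that is given.

    Hypothetical helper; returns the names of the settings it changed,
    in the order they were applied.
    """
    applied = []
    if max_trial_number is not None:
        experiment.update_max_trial_number(max_trial_number)
        applied.append("max_trial_number")
    if trial_concurrency is not None:
        experiment.update_trial_concurrency(trial_concurrency)
        applied.append("trial_concurrency")
    if max_experiment_duration is not None:
        experiment.update_max_experiment_duration(max_experiment_duration)
        applied.append("max_experiment_duration")
    if search_space is not None:
        experiment.update_search_space(search_space)
        applied.append("search_space")
    return applied


# Usage sketch (requires a started or connected experiment):
#
#   apply_updates(experiment, max_trial_number=20, max_experiment_duration='2h')
```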