How to Launch an Experiment from Python
Overview
Since nni v2.0, we provide a new way to launch experiments. Previously, you had to configure the experiment in a YAML configuration file and then launch it with the nnictl command. Now you can also configure and run experiments directly in a Python file. If you are familiar with Python programming, this will be more convenient.
Run a New Experiment
After successfully installing nni, you can start an experiment from a Python script in the following two steps.
Step 1 - Initialize an experiment instance and configure it
from nni.experiment import Experiment
experiment = Experiment('local')
Now you have an Experiment instance. This experiment will launch trials on your local machine because training_service='local'.
See all training services supported in NNI.
# Assumes `from pathlib import Path` and a search_space dict, as in the full example below.
experiment.config.experiment_name = 'MNIST example'
experiment.config.trial_concurrency = 2
experiment.config.max_trial_number = 10
experiment.config.search_space = search_space
experiment.config.trial_command = 'python3 mnist.py'
experiment.config.trial_code_directory = Path(__file__).parent
experiment.config.tuner.name = 'TPE'
experiment.config.tuner.class_args['optimize_mode'] = 'maximize'
experiment.config.training_service.use_active_gpu = True
Use assignments of the form experiment.config.foo = 'bar' to configure your experiment.
See all builtin tuners supported in NNI.
See parameter configuration required by different training services.
Step 2 - Just run
experiment.run(port=8080)
You have now successfully launched an NNI experiment. Open localhost:8080 in your browser to observe the experiment in real time.
Note
Launched this way, the experiment runs in the foreground and exits automatically when it finishes. To run an experiment interactively, use start() instead of run() in Step 2.
Example
Below is an example for this new launching approach. You can also find this code in mnist-tfv2/launch.py.
from pathlib import Path
from nni.experiment import Experiment
search_space = {
"dropout_rate": { "_type": "uniform", "_value": [0.5, 0.9] },
"conv_size": { "_type": "choice", "_value": [2, 3, 5, 7] },
"hidden_size": { "_type": "choice", "_value": [124, 512, 1024] },
"batch_size": { "_type": "choice", "_value": [16, 32] },
"learning_rate": { "_type": "choice", "_value": [0.0001, 0.001, 0.01, 0.1] }
}
experiment = Experiment('local')
experiment.config.experiment_name = 'MNIST example'
experiment.config.trial_concurrency = 2
experiment.config.max_trial_number = 10
experiment.config.search_space = search_space
experiment.config.trial_command = 'python3 mnist.py'
experiment.config.trial_code_directory = Path(__file__).parent
experiment.config.tuner.name = 'TPE'
experiment.config.tuner.class_args['optimize_mode'] = 'maximize'
experiment.config.training_service.use_active_gpu = True
experiment.run(8080)
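To make the search space format concrete, here is a minimal sketch that draws one random configuration from it. It only illustrates what the "choice" and "uniform" entries mean; the helper name sample_search_space is hypothetical, it is not part of NNI, and a real tuner such as TPE picks values far less naively.

```python
import random

def sample_search_space(space, rng):
    """Draw one random configuration from an NNI-style search space dict.

    Hypothetical helper for illustration only; not part of NNI.
    """
    config = {}
    for name, spec in space.items():
        kind, value = spec["_type"], spec["_value"]
        if kind == "choice":
            # "choice": pick one of the listed candidates.
            config[name] = rng.choice(value)
        elif kind == "uniform":
            # "uniform": draw a float between the two bounds.
            low, high = value
            config[name] = rng.uniform(low, high)
        else:
            raise ValueError(f"unsupported _type in this sketch: {kind!r}")
    return config

search_space = {
    "dropout_rate": {"_type": "uniform", "_value": [0.5, 0.9]},
    "conv_size": {"_type": "choice", "_value": [2, 3, 5, 7]},
    "hidden_size": {"_type": "choice", "_value": [124, 512, 1024]},
    "batch_size": {"_type": "choice", "_value": [16, 32]},
    "learning_rate": {"_type": "choice", "_value": [0.0001, 0.001, 0.01, 0.1]},
}

rng = random.Random(0)  # fixed seed so repeated runs give the same draw
sample = sample_search_space(search_space, rng)
print(sample)
```

Each trial receives one such dictionary of parameter values, with keys matching those of the search space.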
Start and Manage a New Experiment
The APIs of NNI Client have been migrated to this new launching approach.
Launch the experiment with start() instead of run(), and you can then use these APIs in interactive mode.
Please refer to example usage and code file python_api_start.ipynb.
Note
run() polls the experiment status and automatically calls stop() when the experiment finishes. start() only launches a new experiment, so you need to stop it manually by calling stop().
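The difference between the two calls can be sketched as a small polling loop built from start(), get_status() and stop(). This is not NNI's implementation: the function name run_until_finished, the set of terminal status strings, the choice to treat "STOPPED" as success, and the FakeExperiment stand-in (used so the sketch runs without NNI installed) are all assumptions made for illustration.

```python
import time

# Assumed terminal status strings; check them against your NNI version.
TERMINAL_STATUSES = {"DONE", "STOPPED", "ERROR"}

def run_until_finished(experiment, port=8080, poll_interval=10.0):
    """Start an experiment, poll its status, and always stop it at the end.

    Returns True unless the experiment ended in the (assumed) "ERROR" status.
    """
    experiment.start(port)
    try:
        while True:
            status = experiment.get_status()
            if status in TERMINAL_STATUSES:
                return status != "ERROR"
            time.sleep(poll_interval)
    finally:
        # With start() alone, this cleanup is the caller's responsibility.
        experiment.stop()

class FakeExperiment:
    """Stand-in exposing the same three methods, so the sketch runs without NNI."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)
        self.stopped = False

    def start(self, port):
        self.port = port

    def get_status(self):
        return next(self._statuses)

    def stop(self):
        self.stopped = True

fake = FakeExperiment(["RUNNING", "RUNNING", "DONE"])
finished_ok = run_until_finished(fake, port=8080, poll_interval=0.0)
print(finished_ok, fake.stopped)
```

The try/finally guarantees that stop() is called even if polling raises, which is the cleanup you take on yourself when you use start() directly.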
Connect and Manage an Existing Experiment
If you launched the experiment with nnictl and also want to use these APIs, use Experiment.connect() to connect to the existing experiment.
Please refer to example usage and code file python_api_connect.ipynb.
Note
You can use stop() to stop the experiment when connected to an existing experiment.
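As a rough sketch of what connecting enables, the helper below summarizes trial jobs by status. It assumes each TrialJob returned by list_trial_jobs() has a status attribute; the function names and the status strings in the demo data are illustrative, not taken from the NNI documentation. The NNI import is deferred into report() so that the counting helper can be tried without NNI installed.

```python
from collections import Counter
from types import SimpleNamespace

def count_jobs_by_status(trial_jobs):
    """Count trial jobs per status; assumes each job has a `status` attribute."""
    return Counter(job.status for job in trial_jobs)

def report(port):
    """Connect to a running experiment and print a short summary (requires NNI)."""
    # Imported lazily so count_jobs_by_status works without NNI installed.
    from nni.experiment import Experiment

    experiment = Experiment.connect(port)
    print("status:", experiment.get_status())
    print("jobs:", dict(count_jobs_by_status(experiment.list_trial_jobs())))

# Stand-in objects with illustrative status strings, used instead of real TrialJob instances.
demo_jobs = [
    SimpleNamespace(status="SUCCEEDED"),
    SimpleNamespace(status="SUCCEEDED"),
    SimpleNamespace(status="RUNNING"),
]
demo_counts = count_jobs_by_status(demo_jobs)
print(dict(demo_counts))
```

With an experiment already running, calling report() with the experiment's web UI port would print its live status and job counts.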
API
class nni.experiment.Experiment(config: nni.experiment.config.common.ExperimentConfig)
class nni.experiment.Experiment(training_service: Union[str, List[str]])
Create and stop an NNI experiment.
config
Experiment configuration.
port
Web UI port of the experiment, or None if it is not running.
classmethod connect(port: int)
Connect to an existing experiment.
- Parameters
port – The port of web UI.
export_data()
Return exported information for all trial jobs.
- Returns
List of TrialResult.
- Return type
list
get_all_experiments_metadata()
Return all experiments metadata as a list.
- Returns
The experiments metadata.
- Return type
list
get_experiment_metadata(exp_id: str)
Return experiment metadata with specified exp_id as a dict.
- Returns
The specified experiment metadata.
- Return type
dict
get_experiment_profile()
Return experiment profile as a dict.
- Returns
The profile of the experiment.
- Return type
dict
get_job_metrics(trial_job_id=None)
Return trial job metrics.
- Parameters
trial_job_id (str) – Trial job id. If this parameter is None, metrics for all trial jobs are returned.
- Returns
Each key is a trialJobId; the corresponding value is a list of TrialMetricData.
- Return type
dict
get_job_statistics()
Return trial job statistics information as a dict.
- Returns
Job statistics information.
- Return type
dict
get_status() → str
Return experiment status as a str.
- Returns
Experiment status.
- Return type
str
get_trial_job(trial_job_id: str)
Return a trial job.
- Parameters
trial_job_id (str) – Trial job id.
- Returns
A TrialJob instance corresponding to trial_job_id.
- Return type
TrialJob
list_trial_jobs()
Return information for all trial jobs as a list.
- Returns
List of TrialJob.
- Return type
list
run(port: int = 8080, debug: bool = False) → bool
Run the experiment.
This function blocks until the experiment finishes or fails.
Returns True when the experiment is done, or False when it has failed.
start(port: int = 8080, debug: bool = False) → None
Start the experiment in the background.
This method raises an exception on failure. If it returns, the experiment should have started successfully.
- Parameters
port – The port of web UI.
debug – Whether to start in debug mode.
update_max_experiment_duration(value: str)
Update an experiment’s max_experiment_duration.
- Parameters
value (str) – A string such as ‘1m’ for one minute or ‘2h’ for two hours. The suffix may be ‘s’ for seconds, ‘m’ for minutes, ‘h’ for hours, or ‘d’ for days.
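To illustrate the duration format described above, here is a small sketch that converts such a string to seconds. It is not NNI's own parser: the function name duration_to_seconds is hypothetical, and it assumes a whole number followed by exactly one of the four suffixes.

```python
import re

# Seconds per unit for the four documented suffixes.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

def duration_to_seconds(value):
    """Convert a string like '1m' or '2h' to a number of seconds.

    Hypothetical helper; assumes an integer followed by one suffix character.
    """
    match = re.fullmatch(r"(\d+)([smhd])", value.strip())
    if match is None:
        raise ValueError(f"not a valid duration string: {value!r}")
    number, suffix = match.groups()
    return int(number) * _UNIT_SECONDS[suffix]

print(duration_to_seconds("1m"))
print(duration_to_seconds("2h"))
```

A check like this can be useful for validating user input before passing the string to update_max_experiment_duration().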
update_max_trial_number(value: int)
Update an experiment’s max_trial_number.
- Parameters
value (int) – New max_trial_number value.
-