Uncategorized Modules

Experiment

class nni.retiarii.experiment.pytorch.RetiariiExeConfig(training_service_platform: Optional[str] = None, execution_engine: Union[str, nni.nas.experiment.config.engine_config.ExecutionEngineConfig] = 'py', **kwargs)[source]
class nni.retiarii.experiment.pytorch.RetiariiExperiment(base_model=None, evaluator=None, applied_mutators=None, strategy=None, trainer=None)[source]

The entry point for a NAS experiment. Users can use this class to start/stop an experiment and to inspect it, e.g., by exporting the results.

RetiariiExperiment is a subclass of nni.experiment.Experiment, so the two share many features, such as a configurable training service for running the experiment distributed on remote servers. Unlike nni.experiment.Experiment, however, RetiariiExperiment does not support configuring:

  • trial_code_directory, which can only be current working directory.

  • search_space, which is auto-generated in NAS.

  • trial_command, which must be python -m nni.retiarii.trial_entry to launch the modularized trial code.

RetiariiExperiment also has no tuner/assessor/advisor, because their roles are covered by the strategy.

Also, unlike nni.experiment.Experiment, which is bound to a node server, RetiariiExperiment starts a node server to schedule the trials only when the strategy is a multi-trial strategy. When the strategy is one-shot, the step of launching a node server is omitted and the experiment runs locally by default.

Configurations of the experiment, such as the execution engine and the number of GPUs allocated, should be put into a RetiariiExeConfig and passed as an argument to RetiariiExperiment.run().

Parameters:
  • base_model (nn.Module) – The model defining the search space / base skeleton without mutation. It should be wrapped by decorator nni.retiarii.model_wrapper.

  • evaluator (nni.retiarii.Evaluator, default = None) – Evaluator for the experiment. If you are using a one-shot trainer, it should be placed here, although this usage is deprecated.

  • applied_mutators (list of nni.retiarii.Mutator, default = None) – Mutators to mutate the base model. If None, mutators are skipped. Note that when base_model uses inline mutations (e.g., LayerChoice), applied_mutators must be empty / None.

  • strategy (nni.retiarii.strategy.BaseStrategy, default = None) – Exploration strategy. Can be multi-trial or one-shot.

  • trainer (BaseOneShotTrainer) – Kept for compatibility purposes.

Examples

Multi-trial NAS:

>>> base_model = Net()
>>> search_strategy = strategy.Random()
>>> model_evaluator = FunctionalEvaluator(evaluate_model)
>>> exp = RetiariiExperiment(base_model, model_evaluator, [], search_strategy)
>>> exp_config = RetiariiExeConfig('local')
>>> exp_config.trial_concurrency = 2
>>> exp_config.max_trial_number = 20
>>> exp_config.training_service.use_active_gpu = False
>>> exp.run(exp_config, 8081)

One-shot NAS:

>>> base_model = Net()
>>> search_strategy = strategy.DARTS()
>>> evaluator = pl.Classification(train_dataloader=train_loader, val_dataloaders=valid_loader)
>>> exp = RetiariiExperiment(base_model, evaluator, [], search_strategy)
>>> exp_config = RetiariiExeConfig()
>>> exp_config.execution_engine = 'oneshot'  # must be set for one-shot strategies
>>> exp.run(exp_config)

Export top models:

>>> for model_dict in exp.export_top_models(formatter='dict'):
...     print(model_dict)
>>> with nni.retiarii.fixed_arch(model_dict):
...     final_model = Net()

export_top_models(top_k=1, optimize_mode='maximize', formatter='dict')[source]

Export several top performing models.

For one-shot algorithms, only top-1 is supported. For others, optimize_mode and formatter are available for customization.

The concrete behavior of export depends on each strategy. See the documentation of each strategy for detailed specifications.

Parameters:
  • top_k (int) – How many models are intended to be exported.

  • optimize_mode (str) – maximize or minimize. Not supported by one-shot algorithms. optimize_mode is likely to be removed and defined in the strategy in the future.

  • formatter (str) – Supports code and dict. Not supported by one-shot algorithms. If code, the Python code of the model will be returned. If dict, the mutation history will be returned.
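
A minimal usage sketch (assuming exp is a finished multi-trial RetiariiExperiment; the top_k value is illustrative):

>>> # export the three best models as mutation dicts and print them
>>> for model_dict in exp.export_top_models(top_k=3, optimize_mode='maximize', formatter='dict'):
...     print(model_dict)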

static resume(experiment_id, port=8080, debug=False)[source]

Resume a stopped experiment.

Parameters:
  • experiment_id (str) – The stopped experiment id.

  • port (int) – The port of web UI.

  • debug (bool) – Whether to start in debug mode.

run(config=None, port=8080, debug=False)[source]

Run the experiment. This function blocks until the experiment finishes or fails.

start(*args, **kwargs)[source]

By design, the only difference between start and run is that start is asynchronous, while run waits for the experiment to complete. RetiariiExperiment always waits for the experiment to complete, because the strategy runs in the foreground.

stop()[source]

Stop background experiment.

static view(experiment_id, port=8080, non_blocking=False)[source]

View a stopped experiment.

Parameters:
  • experiment_id (str) – The stopped experiment id.

  • port (int) – The port of web UI.

  • non_blocking (bool) – If false, run in the foreground. If true, run in the background.

NAS Benchmarks

NAS-Bench-101

class nni.nas.benchmarks.nasbench101.Nb101IntermediateStats(*args, **kwargs)[source]

Intermediate statistics for NAS-Bench-101.

trial

The exact trial where the intermediate result is produced.

Type:

Nb101TrialStats

current_epoch

Elapsed epochs when evaluation is done.

Type:

int

train_acc

Intermediate accuracy on training data, ranging from 0 to 100.

Type:

float

valid_acc

Intermediate accuracy on validation data, ranging from 0 to 100.

Type:

float

test_acc

Intermediate accuracy on test data, ranging from 0 to 100.

Type:

float

training_time

Time elapsed in seconds.

Type:

float

class nni.nas.benchmarks.nasbench101.Nb101TrialConfig(*args, **kwargs)[source]

Trial config for NAS-Bench-101.

arch

A dict with keys op1, op2, … and input1, input2, … Vertices are enumerated from 0. Since vertex 0 is the input node, it is skipped in this dict. Each op is one of nni.nas.benchmark.nasbench101.CONV3X3_BN_RELU, nni.nas.benchmark.nasbench101.CONV1X1_BN_RELU, and nni.nas.benchmark.nasbench101.MAXPOOL3X3. Each input is a list of previous nodes. For example, input5 can be [0, 1, 3].

Type:

dict

num_vertices

Number of vertices (nodes) in one cell. Should be less than or equal to 7 in default setup.

Type:

int

hash

Graph-invariant MD5 string for this architecture.

Type:

str

num_epochs

Number of epochs planned for this trial. Should be one of 4, 12, 36, 108 in default setup.

Type:

int

class nni.nas.benchmarks.nasbench101.Nb101TrialStats(*args, **kwargs)[source]

Computation statistics for NAS-Bench-101. Each record corresponds to one trial. Each config has multiple trials with different random seeds, but unfortunately the seed for each trial is unavailable. NAS-Bench-101 trains and evaluates on CIFAR-10 by default. The original training set is divided into 40k training images and 10k validation images, and the original validation set is used for testing only.

config

Setup for this trial data.

Type:

Nb101TrialConfig

train_acc

Final accuracy on training data, ranging from 0 to 100.

Type:

float

valid_acc

Final accuracy on validation data, ranging from 0 to 100.

Type:

float

test_acc

Final accuracy on test data, ranging from 0 to 100.

Type:

float

parameters

Number of trainable parameters, in millions.

Type:

float

training_time

Duration of training in seconds.

Type:

float

nni.nas.benchmarks.nasbench101.query_nb101_trial_stats(arch, num_epochs, isomorphism=True, reduction=None, include_intermediates=False)[source]

Query trial stats of NAS-Bench-101 given conditions.

Parameters:
  • arch (dict or None) – If a dict, it is in the format described in nni.nas.benchmark.nasbench101.Nb101TrialConfig. Only matched trial stats will be returned. If None, all architectures in the database will be matched.

  • num_epochs (int or None) – If an int, only results with a matching number of epochs will be returned. Otherwise it is treated as a wildcard.

  • isomorphism (boolean) – Whether to match essentially-identical architectures, i.e., architectures with the same graph-invariant hash value.

  • reduction (str or None) – If ‘none’ or None, all trial stats will be returned directly. If ‘mean’, fields in trial stats will be averaged given the same trial config.

  • include_intermediates (boolean) – If true, intermediate results will be returned.

Returns:

A generator of nni.nas.benchmark.nasbench101.Nb101TrialStats objects, where each of them has been converted into a dict.

Return type:

generator of dict
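
A minimal query sketch. The architecture dict below is a hypothetical cell written in the format documented above; the op strings are the values of the CONV3X3_BN_RELU, CONV1X1_BN_RELU and MAXPOOL3X3 constants.

from nni.nas.benchmarks.nasbench101 import query_nb101_trial_stats

# a hypothetical 7-vertex cell; vertex 0 (input) has no op,
# vertex 6 is the output and only has inputs
arch = {
    'op1': 'conv3x3-bn-relu',
    'op2': 'maxpool3x3',
    'op3': 'conv3x3-bn-relu',
    'op4': 'conv3x3-bn-relu',
    'op5': 'conv1x1-bn-relu',
    'input1': [0],
    'input2': [1],
    'input3': [2],
    'input4': [0],
    'input5': [0, 3, 4],
    'input6': [2, 5],
}
for stats in query_nb101_trial_stats(arch, num_epochs=108):
    print(stats['test_acc'], stats['training_time'])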

NAS-Bench-201

class nni.nas.benchmarks.nasbench201.Nb201IntermediateStats(*args, **kwargs)[source]

Intermediate statistics for NAS-Bench-201.

trial

Corresponding trial.

Type:

Nb201TrialStats

current_epoch

Elapsed epochs.

Type:

int

train_acc

Current accuracy on training data, ranging from 0 to 100.

Type:

float

valid_acc

Current accuracy on validation data, ranging from 0 to 100.

Type:

float

test_acc

Current accuracy on test data, ranging from 0 to 100.

Type:

float

ori_test_acc

Test accuracy on original validation set (10k for CIFAR and 12k for Imagenet16-120), ranging from 0 to 100.

Type:

float

train_loss

Current cross entropy loss on training data.

Type:

float or None

valid_loss

Current cross entropy loss on validation data.

Type:

float or None

test_loss

Current cross entropy loss on test data.

Type:

float or None

ori_test_loss

Current cross entropy loss on original validation set.

Type:

float or None

class nni.nas.benchmarks.nasbench201.Nb201TrialConfig(*args, **kwargs)[source]

Trial config for NAS-Bench-201.

arch

A dict with keys 0_1, 0_2, 0_3, 1_2, 1_3, 2_3, each of which is an operator chosen from nni.nas.benchmark.nasbench201.NONE, nni.nas.benchmark.nasbench201.SKIP_CONNECT, nni.nas.benchmark.nasbench201.CONV_1X1, nni.nas.benchmark.nasbench201.CONV_3X3 and nni.nas.benchmark.nasbench201.AVG_POOL_3X3.

Type:

dict

num_epochs

Number of epochs planned for this trial. Should be either 12 or 200.

Type:

int

num_channels

Number of channels for initial convolution. 16 by default.

Type:

int

num_cells

Number of cells per stage. 5 by default.

Type:

int

dataset

Dataset used for training and evaluation. NAS-Bench-201 provides the following 4 options: cifar10-valid (training data is split into 25k for training and 25k for validation; the validation data is used for testing), cifar10 (training data is used for training; the validation data is split into 5k for validation and 5k for testing), cifar100 (same protocol as cifar10), and imagenet16-120 (a subset of 120 classes in ImageNet, downscaled to 16x16, using the training data for training, 6k images from the validation set for validation, and the other 6k for testing).

Type:

str

class nni.nas.benchmarks.nasbench201.Nb201TrialStats(*args, **kwargs)[source]

Computation statistics for NAS-Bench-201. Each corresponds to one trial.

config

Setup for this trial data.

Type:

Nb201TrialConfig

seed

Random seed selected, for reproduction.

Type:

int

train_acc

Final accuracy on training data, ranging from 0 to 100.

Type:

float

valid_acc

Final accuracy on validation data, ranging from 0 to 100.

Type:

float

test_acc

Final accuracy on test data, ranging from 0 to 100.

Type:

float

ori_test_acc

Test accuracy on original validation set (10k for CIFAR and 12k for Imagenet16-120), ranging from 0 to 100.

Type:

float

train_loss

Final cross entropy loss on training data. Note that the loss could be NaN, in which case this attribute will be None.

Type:

float or None

valid_loss

Final cross entropy loss on validation data.

Type:

float or None

test_loss

Final cross entropy loss on test data.

Type:

float or None

ori_test_loss

Final cross entropy loss on original validation set.

Type:

float or None

parameters

Number of trainable parameters, in millions.

Type:

float

latency

Latency in seconds.

Type:

float

flops

FLOPs, in millions.

Type:

float

training_time

Duration of training in seconds.

Type:

float

valid_evaluation_time

Time elapsed to evaluate on validation set.

Type:

float

test_evaluation_time

Time elapsed to evaluate on test set.

Type:

float

ori_test_evaluation_time

Time elapsed to evaluate on original test set.

Type:

float

nni.nas.benchmarks.nasbench201.query_nb201_trial_stats(arch, num_epochs, dataset, reduction=None, include_intermediates=False)[source]

Query trial stats of NAS-Bench-201 given conditions.

Parameters:
  • arch (dict or None) – If a dict, it is in the format described in nni.nas.benchmark.nasbench201.Nb201TrialConfig. Only matched trial stats will be returned. If None, all architectures in the database will be matched.

  • num_epochs (int or None) – If an int, only results with a matching number of epochs will be returned. Otherwise it is treated as a wildcard.

  • dataset (str or None) – If specified, can be one of the datasets available in nni.nas.benchmark.nasbench201.Nb201TrialConfig. Otherwise a wildcard.

  • reduction (str or None) – If ‘none’ or None, all trial stats will be returned directly. If ‘mean’, fields in trial stats will be averaged given the same trial config.

  • include_intermediates (boolean) – If true, intermediate results will be returned.

Returns:

A generator of nni.nas.benchmark.nasbench201.Nb201TrialStats objects, where each of them has been converted into a dict.

Return type:

generator of dict
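
A minimal query sketch. The architecture below is hypothetical; the operator strings are the values of the constants listed under Nb201TrialConfig.

from nni.nas.benchmarks.nasbench201 import query_nb201_trial_stats

# a hypothetical cell: one operator per edge i_j of the DAG
arch = {
    '0_1': 'avg_pool_3x3',
    '0_2': 'conv_1x1',
    '0_3': 'conv_1x1',
    '1_2': 'skip_connect',
    '1_3': 'skip_connect',
    '2_3': 'skip_connect',
}
for stats in query_nb201_trial_stats(arch, num_epochs=200, dataset='cifar100'):
    print(stats['seed'], stats['test_acc'])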

NDS

class nni.nas.benchmarks.nds.NdsIntermediateStats(*args, **kwargs)[source]

Intermediate statistics for NDS.

trial

Corresponding trial.

Type:

NdsTrialStats

current_epoch

Elapsed epochs.

Type:

int

train_loss

Current cross entropy loss on training data. Can be NaN (None).

Type:

float or None

train_acc

Current accuracy on training data, ranging from 0 to 100.

Type:

float

test_acc

Current accuracy on test data, ranging from 0 to 100.

Type:

float

class nni.nas.benchmarks.nds.NdsTrialConfig(*args, **kwargs)[source]

Trial config for NDS.

model_family

Could be nas_cell, residual_bottleneck, residual_basic or vanilla.

Type:

str

model_spec

If model_family is nas_cell, it contains num_nodes_normal, num_nodes_reduce, depth, width, aux and drop_prob. If model_family is residual_bottleneck, it contains bot_muls, ds (depths), num_gs (number of groups) and ss (strides). If model_family is residual_basic or vanilla, it contains ds, ss and ws.

Type:

dict

cell_spec

If model_family is not nas_cell, it will be an empty dict. Otherwise, it specifies <normal/reduce>_<i>_<op/input>_<x/y>, where i ranges from 0 to num_nodes_<normal/reduce> - 1. If the key is an op, the value is chosen from the constants specified previously, like nni.nas.benchmark.nds.CONV_1X1. If the key is node i's input, the value ranges from 0 to i + 1, because nas_cell uses the previous two nodes as inputs, and node 0 is actually the second node. Refer to the NASNet paper for details. Finally, another two key-value pairs, normal_concat and reduce_concat, specify which nodes are eventually concatenated into the output. A hypothetical example is sketched after this class's attribute list.

Type:

dict

dataset

Dataset used. Could be cifar10 or imagenet.

Type:

str

generator

One of: random, which generates configurations at random while keeping learning rate and weight decay fixed; fix_w_d, which further keeps width and depth fixed (only applicable to nas_cell); or tune_lr_wd, which further tunes learning rate and weight decay.

Type:

str

proposer

The paper that proposed the distribution for random sampling. Available proposers include nasnet, darts, enas, pnas, amoeba, vanilla, resnext-a, resnext-b, resnet, and resnet-b (ResNet with bottleneck). See the NDS paper for details.

Type:

str

base_lr

Initial learning rate.

Type:

float

weight_decay

L2 weight decay applied on weights.

Type:

float

num_epochs

Number of epochs scheduled, during which learning rate will decay to 0 following cosine annealing.

Type:

int
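
A hypothetical cell_spec for a nas_cell model with two nodes per cell, written in the format documented above (the concrete operator names are illustrative, not taken from the database):

cell_spec = {
    # node 0 of the normal cell: inputs 0 and 1 are the two cell inputs
    'normal_0_op_x': 'avg_pool_3x3', 'normal_0_input_x': 0,
    'normal_0_op_y': 'conv_3x3',     'normal_0_input_y': 1,
    # node 1 may additionally use node 0's output (index 2)
    'normal_1_op_x': 'skip_connect', 'normal_1_input_x': 2,
    'normal_1_op_y': 'conv_3x3',     'normal_1_input_y': 0,
    'reduce_0_op_x': 'max_pool_3x3', 'reduce_0_input_x': 0,
    'reduce_0_op_y': 'conv_3x3',     'reduce_0_input_y': 1,
    'reduce_1_op_x': 'skip_connect', 'reduce_1_input_x': 2,
    'reduce_1_op_y': 'conv_3x3',     'reduce_1_input_y': 1,
    # nodes concatenated into the cell output
    'normal_concat': [2, 3],
    'reduce_concat': [2, 3],
}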

class nni.nas.benchmarks.nds.NdsTrialStats(*args, **kwargs)[source]

Computation statistics for NDS. Each corresponds to one trial.

config

Corresponding config for trial.

Type:

NdsTrialConfig

seed

Random seed selected, for reproduction.

Type:

int

final_train_acc

Final accuracy on training data, ranging from 0 to 100.

Type:

float

final_train_loss

Final cross entropy loss on training data. Could be NaN (None).

Type:

float or None

final_test_acc

Final accuracy on test data, ranging from 0 to 100.

Type:

float

best_train_acc

Best accuracy on training data, ranging from 0 to 100.

Type:

float

best_train_loss

Best cross entropy loss on training data. Could be NaN (None).

Type:

float or None

best_test_acc

Best accuracy on test data, ranging from 0 to 100.

Type:

float

parameters

Number of trainable parameters, in millions.

Type:

float

flops

FLOPs, in millions.

Type:

float

iter_time

Seconds elapsed for each iteration.

Type:

float

nni.nas.benchmarks.nds.query_nds_trial_stats(model_family, proposer, generator, model_spec, cell_spec, dataset, num_epochs=None, reduction=None, include_intermediates=False)[source]

Query trial stats of NDS given conditions.

Parameters:
  • model_family (str or None) – If str, can be one of the model families available in nni.nas.benchmark.nds.NdsTrialConfig. Otherwise a wildcard.

  • proposer (str or None) – If str, can be one of the proposers available in nni.nas.benchmark.nds.NdsTrialConfig. Otherwise a wildcard.

  • generator (str or None) – If str, can be one of the generators available in nni.nas.benchmark.nds.NdsTrialConfig. Otherwise a wildcard.

  • model_spec (dict or None) – If specified, can be one of the model specs available in nni.nas.benchmark.nds.NdsTrialConfig. Otherwise a wildcard.

  • cell_spec (dict or None) – If specified, can be one of the cell specs available in nni.nas.benchmark.nds.NdsTrialConfig. Otherwise a wildcard.

  • dataset (str or None) – If str, can be one of the datasets available in nni.nas.benchmark.nds.NdsTrialConfig. Otherwise a wildcard.

  • num_epochs (int or None) – If an int, only results with a matching number of epochs will be returned. Otherwise it is treated as a wildcard.

  • reduction (str or None) – If ‘none’ or None, all trial stats will be returned directly. If ‘mean’, fields in trial stats will be averaged given the same trial config.

  • include_intermediates (boolean) – If true, intermediate results will be returned.

Returns:

A generator of nni.nas.benchmark.nds.NdsTrialStats objects, where each of them has been converted into a dict.

Return type:

generator of dict
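
A minimal query sketch; any condition passed as None acts as a wildcard. The combination below (DARTS-proposed nas_cell models on CIFAR-10) is an illustrative choice.

from itertools import islice
from nni.nas.benchmarks.nds import query_nds_trial_stats

# match every nas_cell model proposed by darts on cifar10, regardless of
# generator, model spec and cell spec; print the first three records
query = query_nds_trial_stats('nas_cell', 'darts', None, None, None, 'cifar10')
for stats in islice(query, 3):
    print(stats['seed'], stats['best_test_acc'])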

Retrain (Architecture Evaluation)

nni.retiarii.fixed_arch(fixed_arch, verbose=True)[source]

Load an architecture from fixed_arch and apply it to the model. This should be used as a context manager. For example,

with fixed_arch('/path/to/export.json'):
    model = Model(3, 224, 224)

Parameters:
  • fixed_arch (str, Path or dict) – Path to the JSON file that stores the architecture, or a dict that stores the exported architecture.

  • verbose (bool) – Print log messages if set to True

Returns:

Context manager that provides a fixed architecture when the model is created.

Return type:

ContextStack
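
A sketch of the dict form, assuming exp is a finished multi-trial RetiariiExperiment and Net is the base model class:

import nni.retiarii

# take the best exported mutation dict and instantiate the corresponding
# fixed (non-mutable) model
exported = exp.export_top_models(formatter='dict')[0]
with nni.retiarii.fixed_arch(exported):
    final_model = Net()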

Utilities

nni.retiarii.basic_unit(cls, basic_unit_tag=True)[source]

Wrapping a module as a basic unit makes it a primitive and stops the engine from digging deeper into it.

basic_unit_tag is true by default. If set to false, the module will not be explicitly marked as a basic unit, and the graph parser will continue to parse into it. Currently, this is to handle a special case in nn.Sequential.

Although basic_unit calls trace in its implementation, it is not for serialization. Rather, it is meant to capture the initialization arguments for mutation. Also, the graph execution engine will stop digging into the inner modules when it reaches a module that is decorated with basic_unit.

@basic_unit
class PrimitiveOp(nn.Module):
    ...
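
A slightly fuller sketch (the module and its argument are hypothetical):

import torch.nn as nn
from nni.retiarii import basic_unit

@basic_unit
class Scale(nn.Module):
    """Treated as a primitive: the parser will not look inside forward()."""
    def __init__(self, factor=2.0):
        super().__init__()
        self.factor = factor  # the init argument is captured for re-instantiation

    def forward(self, x):
        return x * self.factor
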
nni.retiarii.model_wrapper(cls)[source]

Wrap the base model (search space). For example,

@model_wrapper
class MyModel(nn.Module):
    ...

The wrapper serves two purposes:

  1. Capture the init parameters of python class so that it can be re-instantiated in another process.

  2. Reset uid in namespace so that the auto label counting in each model stably starts from zero.

Currently, NNI might not complain in simple cases where @model_wrapper is actually not needed. But in the future, we might require @model_wrapper for the base model.
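
A minimal sketch of a wrapped search space (the concrete layers are illustrative; LayerChoice is the inline mutation API from nni.retiarii.nn.pytorch):

import torch.nn as nn
import nni.retiarii.nn.pytorch as retiarii_nn
from nni.retiarii import model_wrapper

@model_wrapper
class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        # an inline mutation: the strategy chooses one of the two convolutions
        self.conv = retiarii_nn.LayerChoice([
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.Conv2d(3, 16, kernel_size=5, padding=2),
        ])
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(16, 10)

    def forward(self, x):
        x = self.pool(self.conv(x))
        return self.fc(x.flatten(1))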

class nni.retiarii.nn.pytorch.mutation_utils.Mutable[source]

This is just an implementation trick for now.

In the future, this could be the base class for all PyTorch mutables, including layer choice, input choice, etc. This is not considered an interface, but rather a base class consisting of commonly used class/instance methods. For API developers, it is not recommended to use isinstance(module, Mutable) to check for mutable modules until the design is finalized.

classmethod create_fixed_module(*args, **kwargs)[source]

Try to create a fixed module from the fixed dict. If the code is running in a trial, this method will succeed, and a concrete module instead of a mutable will be created. Raises NoContextError if the creation fails.

class nni.retiarii.utils.ContextStack(key, value)[source]

This maintains a globally-accessible context environment that is visible everywhere.

Use with ContextStack(namespace, value): to initiate, and use get_current_context(namespace) to get the corresponding value in the namespace.

Note that this is not multi-processing safe. Also, the values will get cleared for a new process.
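
A minimal sketch (the namespace name and value are illustrative):

from nni.retiarii.utils import ContextStack, get_current_context

with ContextStack('example_namespace', {'conv': 'conv3x3'}):
    # any code called inside this block can look the value up by namespace
    value = get_current_context('example_namespace')
    assert value == {'conv': 'conv3x3'}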

class nni.retiarii.utils.ModelNamespace(key='model')[source]

To create an individual namespace for models:

  1. to enable automatic numbering;

  2. to trace general information (like creation of hyper-parameters) of model.

A namespace is bound to a key. Namespaces bound to different keys are completely isolated. A namespace can have sub-namespaces (with the same key). The numbering will be chained (e.g., model_1_4_2).

static current_context(key='model')[source]

Get the current context in key.

static next_label(key='model')[source]

Get the next label for API calls, with automatic numbering.

exception nni.retiarii.utils.NoContextError[source]

Exception raised when context is missing.

nni.retiarii.utils.original_state_dict_hooks(model)[source]

Use this patch if you want to save/load state dict in the original state dict hierarchy.

For example, when you already have a state dict for the base model / search space (which often happens when you have trained a supernet with one-shot strategies), the state dict isn’t organized in the same way as when a sub-model is sampled from the search space. This patch will help the modules in the sub-model find the corresponding module in the base model.

The code looks like,

with original_state_dict_hooks(model):
    model.load_state_dict(state_dict_from_supernet, strict=False)  # supernet has extra keys

Or vice-versa,

with original_state_dict_hooks(model):
    supernet_style_state_dict = model.state_dict()