Uncategorized Modules

Experiment

class nni.retiarii.experiment.pytorch.RetiariiExeConfig(training_service_platform: Optional[str] = None, execution_engine: Union[str, nni.retiarii.experiment.config.engine_config.ExecutionEngineConfig] = 'py', **kwargs)[source]
class nni.retiarii.experiment.pytorch.RetiariiExperiment(base_model, evaluator=None, applied_mutators=None, strategy=None, trainer=None)[source]

The entry for a NAS experiment. Users can use this class to start/stop or inspect an experiment, like exporting the results.

RetiariiExperiment is a subclass of nni.experiment.Experiment and shares many of its features, such as a configurable training service for running the experiment in a distributed fashion on remote servers. Unlike nni.experiment.Experiment, however, RetiariiExperiment does not support configuring:

  • trial_code_directory, which can only be current working directory.

  • search_space, which is auto-generated in NAS.

  • trial_command, which must be python -m nni.retiarii.trial_entry to launch the modularized trial code.

RetiariiExperiment also has no tuner, assessor or advisor, because their roles are all implemented in the strategy.

Also, unlike nni.experiment.Experiment, which is bound to a node server, RetiariiExperiment starts a node server to schedule the trials only when the strategy is a multi-trial strategy. When the strategy is one-shot, the step of launching the node server is skipped, and the experiment runs locally by default.

Experiment configuration, such as the execution engine and the number of GPUs allocated, should be put into a RetiariiExeConfig and passed as an argument to RetiariiExperiment.run().

Parameters
  • base_model (nn.Module) – The model defining the search space / base skeleton without mutation. It should be wrapped by decorator nni.retiarii.model_wrapper.

  • evaluator (nni.retiarii.Evaluator, default = None) – Evaluator for the experiment. If you are using a one-shot trainer, it should be placed here, although this usage is deprecated.

  • applied_mutators (list of nni.retiarii.Mutator, default = None) – Mutators to mutate the base model. If None, mutators are skipped. Note that when base_model uses inline mutations (e.g., LayerChoice), applied_mutators must be empty or None.

  • strategy (nni.retiarii.strategy.BaseStrategy, default = None) – Exploration strategy. Can be multi-trial or one-shot.

  • trainer (BaseOneShotTrainer) – Kept for compatibility purposes.

Examples

Multi-trial NAS:

>>> base_model = Net()
>>> search_strategy = strategy.Random()
>>> model_evaluator = FunctionalEvaluator(evaluate_model)
>>> exp = RetiariiExperiment(base_model, model_evaluator, [], search_strategy)
>>> exp_config = RetiariiExeConfig('local')
>>> exp_config.trial_concurrency = 2
>>> exp_config.max_trial_number = 20
>>> exp_config.training_service.use_active_gpu = False
>>> exp.run(exp_config, 8081)

One-shot NAS:

>>> base_model = Net()
>>> search_strategy = strategy.DARTS()
>>> evaluator = pl.Classification(train_dataloader=train_loader, val_dataloaders=valid_loader)
>>> exp = RetiariiExperiment(base_model, evaluator, [], search_strategy)
>>> exp_config = RetiariiExeConfig()
>>> exp_config.execution_engine = 'oneshot'  # must be set for one-shot strategies
>>> exp.run(exp_config)

Export top models:

>>> for model_dict in exp.export_top_models(formatter='dict'):
...     print(model_dict)
>>> with nni.retiarii.fixed_arch(model_dict):
...     final_model = Net()

export_top_models(top_k=1, optimize_mode='maximize', formatter='dict')[source]

Export several top performing models.

For one-shot algorithms, only top-1 is supported. For others, optimize_mode and formatter are available for customization.

The concrete behavior of export depends on each strategy. See the documentation of each strategy for detailed specifications.

Parameters
  • top_k (int) – How many models are intended to be exported.

  • optimize_mode (str) – maximize or minimize. Not supported by one-shot algorithms. optimize_mode is likely to be removed and defined in strategy in future.

  • formatter (str) – Support code and dict. Not supported by one-shot algorithms. If code, the python code of model will be returned. If dict, the mutation history will be returned.
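
To make the interplay of top_k and optimize_mode concrete, the following plain-Python sketch ranks a list of evaluated architectures. The helper select_top_k and the record layout (arch and metric keys) are hypothetical and not part of NNI; the actual export behavior depends on the strategy in use.

```python
from typing import Any, Dict, List


def select_top_k(records: List[Dict[str, Any]], top_k: int = 1,
                 optimize_mode: str = 'maximize') -> List[Dict[str, Any]]:
    """Return the top_k records ranked by their 'metric' field (hypothetical layout)."""
    if optimize_mode not in ('maximize', 'minimize'):
        raise ValueError(f'Unsupported optimize_mode: {optimize_mode}')
    # Larger is better for 'maximize', smaller is better for 'minimize'.
    ranked = sorted(records, key=lambda r: r['metric'],
                    reverse=(optimize_mode == 'maximize'))
    return ranked[:top_k]


# Toy records standing in for evaluated models.
records = [
    {'arch': {'layer': 'conv3x3'}, 'metric': 0.91},
    {'arch': {'layer': 'conv5x5'}, 'metric': 0.94},
    {'arch': {'layer': 'maxpool'}, 'metric': 0.88},
]
best = select_top_k(records, top_k=2)
worst = select_top_k(records, top_k=1, optimize_mode='minimize')
```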

run(config=None, port=8080, debug=False)[source]

Run the experiment. This function blocks until the experiment finishes or fails.

start(*args, **kwargs)[source]

By design, the only difference between start and run is that start is asynchronous, while run waits for the experiment to complete. RetiariiExperiment always waits for the experiment to complete, as the strategy runs in the foreground.

stop()[source]

Stop background experiment.

NAS Benchmarks

NAS-Bench-101

class nni.nas.benchmarks.nasbench101.Nb101IntermediateStats(*args, **kwargs)[source]

Intermediate statistics for NAS-Bench-101.

trial

The exact trial where the intermediate result is produced.

Type

Nb101TrialStats

current_epoch

Elapsed epochs when evaluation is done.

Type

int

train_acc

Intermediate accuracy on training data, ranging from 0 to 100.

Type

float

valid_acc

Intermediate accuracy on validation data, ranging from 0 to 100.

Type

float

test_acc

Intermediate accuracy on test data, ranging from 0 to 100.

Type

float

training_time

Time elapsed in seconds.

Type

float

class nni.nas.benchmarks.nasbench101.Nb101TrialConfig(*args, **kwargs)[source]

Trial config for NAS-Bench-101.

arch

A dict with keys op1, op2, … and input1, input2, … Vertices are enumerated from 0. Since node 0 is the input node, it is skipped in this dict. Each op is one of nni.nas.benchmarks.nasbench101.CONV3X3_BN_RELU, nni.nas.benchmarks.nasbench101.CONV1X1_BN_RELU, and nni.nas.benchmarks.nasbench101.MAXPOOL3X3. Each input is a list of previous nodes. For example, input5 can be [0, 1, 3].

Type

dict
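
As an illustration of the arch format, the sketch below builds a small architecture dict and checks the two rules stated above: every op is one of the three operators, and every input list only refers to earlier nodes. The string values of the constants, the helper check_arch, and the convention that the last node carries inputs but no op are assumptions made for this example, not something this page specifies.

```python
from typing import Dict, List, Union

# Assumed string values of the NAS-Bench-101 operator constants.
CONV3X3_BN_RELU = 'conv3x3-bn-relu'
CONV1X1_BN_RELU = 'conv1x1-bn-relu'
MAXPOOL3X3 = 'maxpool3x3'
ALLOWED_OPS = {CONV3X3_BN_RELU, CONV1X1_BN_RELU, MAXPOOL3X3}

Arch = Dict[str, Union[str, List[int]]]


def check_arch(arch: Arch) -> None:
    """Check that ops are known and that inputs only point to earlier nodes."""
    for key, value in arch.items():
        if key.startswith('op'):
            if not isinstance(value, str) or value not in ALLOWED_OPS:
                raise ValueError(f'{key}: unknown operator {value!r}')
        elif key.startswith('input'):
            node = int(key[len('input'):])
            if not all(0 <= src < node for src in value):
                raise ValueError(f'{key}: inputs must be previous nodes')
        else:
            raise ValueError(f'Unexpected key {key!r}')


# Node 0 is the input node, so keys start at 1.
# Assumption: node 4 is the output node and has inputs but no op.
arch = {
    'op1': CONV3X3_BN_RELU,
    'op2': MAXPOOL3X3,
    'op3': CONV1X1_BN_RELU,
    'input1': [0],
    'input2': [1],
    'input3': [0, 2],
    'input4': [1, 3],
}
check_arch(arch)  # passes silently for a well-formed dict
```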

num_vertices

Number of vertices (nodes) in one cell. Should be less than or equal to 7 in default setup.

Type

int

hash

Graph-invariant MD5 string for this architecture.

Type

str

num_epochs

Number of epochs planned for this trial. Should be one of 4, 12, 36, 108 in default setup.

Type

int

class nni.nas.benchmarks.nasbench101.Nb101TrialStats(*args, **kwargs)[source]

Computation statistics for NAS-Bench-101. Each corresponds to one trial. Each config has multiple trials with different random seeds, but unfortunately the seed for each trial is unavailable. NAS-Bench-101 trains and evaluates on CIFAR-10 by default. The original training set is divided into 40k training images and 10k validation images, and the original validation set is used for test only.

config

Setup for this trial data.

Type

Nb101TrialConfig

train_acc

Final accuracy on training data, ranging from 0 to 100.

Type

float

valid_acc

Final accuracy on validation data, ranging from 0 to 100.

Type

float

test_acc

Final accuracy on test data, ranging from 0 to 100.

Type

float

parameters

Number of trainable parameters in million.

Type

float

training_time

Duration of training in seconds.

Type

float

nni.nas.benchmarks.nasbench101.query_nb101_trial_stats(arch, num_epochs, isomorphism=True, reduction=None, include_intermediates=False)[source]

Query trial stats of NAS-Bench-101 given conditions.

Parameters
  • arch (dict or None) – If a dict, it is in the format described in nni.nas.benchmarks.nasbench101.Nb101TrialConfig, and only matching trial stats will be returned. If None, all architectures in the database will be matched.

  • num_epochs (int or None) – If int, matching results will be returned. Otherwise a wildcard.

  • isomorphism (boolean) – Whether to match architectures that are essentially the same, i.e., architectures with the same graph-invariant hash value.

  • reduction (str or None) – If ‘none’ or None, all trial stats will be returned directly. If ‘mean’, fields in trial stats will be averaged given the same trial config.

  • include_intermediates (boolean) – If true, intermediate results will be returned.

Returns

A generator of nni.nas.benchmarks.nasbench101.Nb101TrialStats objects, where each of them has been converted into a dict.

Return type

generator of dict
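
The effect of reduction='mean' can be pictured with the plain-Python sketch below, which groups trial dicts by their config and averages the remaining numeric fields. The function reduce_mean, the string used as a stand-in for the config, and the toy numbers are all hypothetical; the actual query runs against the benchmark database.

```python
from collections import defaultdict
from statistics import mean
from typing import Any, Dict, Iterable, Iterator, List


def reduce_mean(trials: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Average numeric fields of trials that share the same 'config' value."""
    groups: Dict[Any, List[Dict[str, Any]]] = defaultdict(list)
    for trial in trials:
        groups[trial['config']].append(trial)
    for config, members in groups.items():
        averaged: Dict[str, Any] = {'config': config}
        for field in members[0]:
            if field != 'config':
                averaged[field] = mean(m[field] for m in members)
        yield averaged


# Toy data: two trials (different seeds) for one config, one trial for another.
trials = [
    {'config': 'arch-a/108', 'test_acc': 93.0, 'training_time': 1000.0},
    {'config': 'arch-a/108', 'test_acc': 94.0, 'training_time': 1200.0},
    {'config': 'arch-b/108', 'test_acc': 90.0, 'training_time': 800.0},
]
reduced = list(reduce_mean(trials))
```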

NAS-Bench-201

class nni.nas.benchmarks.nasbench201.Nb201IntermediateStats(*args, **kwargs)[source]

Intermediate statistics for NAS-Bench-201.

trial

Corresponding trial.

Type

Nb201TrialStats

current_epoch

Elapsed epochs.

Type

int

train_acc

Current accuracy on training data, ranging from 0 to 100.

Type

float

valid_acc

Current accuracy on validation data, ranging from 0 to 100.

Type

float

test_acc

Current accuracy on test data, ranging from 0 to 100.

Type

float

ori_test_acc

Test accuracy on original validation set (10k for CIFAR and 12k for Imagenet16-120), ranging from 0 to 100.

Type

float

train_loss

Current cross entropy loss on training data.

Type

float or None

valid_loss

Current cross entropy loss on validation data.

Type

float or None

test_loss

Current cross entropy loss on test data.

Type

float or None

ori_test_loss

Current cross entropy loss on original validation set.

Type

float or None

class nni.nas.benchmarks.nasbench201.Nb201TrialConfig(*args, **kwargs)[source]

Trial config for NAS-Bench-201.

arch

A dict with keys 0_1, 0_2, 0_3, 1_2, 1_3, 2_3, each of which is an operator chosen from nni.nas.benchmarks.nasbench201.NONE, nni.nas.benchmarks.nasbench201.SKIP_CONNECT, nni.nas.benchmarks.nasbench201.CONV_1X1, nni.nas.benchmarks.nasbench201.CONV_3X3 and nni.nas.benchmarks.nasbench201.AVG_POOL_3X3.

Type

dict
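
The six keys can be generated rather than typed by hand. The sketch below reads each key i_j as a pair of nodes with i < j among four nodes (an interpretation, not stated on this page) and samples a random architecture. The string values of the operator constants and the helper names are assumptions for illustration.

```python
import random
from itertools import combinations
from typing import Dict, List

# Assumed string values of the NAS-Bench-201 operator constants.
OPS = ['none', 'skip_connect', 'conv_1x1', 'conv_3x3', 'avg_pool_3x3']

NUM_NODES = 4  # nodes 0..3 give exactly the six keys listed above


def edge_keys(num_nodes: int = NUM_NODES) -> List[str]:
    """Return keys 'i_j' for every node pair with i < j."""
    return [f'{i}_{j}' for i, j in combinations(range(num_nodes), 2)]


def random_arch(rng: random.Random) -> Dict[str, str]:
    """Pick one operator per key."""
    return {key: rng.choice(OPS) for key in edge_keys()}


arch = random_arch(random.Random(0))
```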

num_epochs

Number of epochs planned for this trial. Should be either 12 or 200.

Type

int

num_channels

Number of channels for initial convolution. 16 by default.

Type

int

num_cells

Number of cells per stage. 5 by default.

Type

int

dataset

Dataset used for training and evaluation. NAS-Bench-201 provides the following 4 options: cifar10-valid (training data is split into 25k for training and 25k for validation, validation data is used for test), cifar10 (training data is used in training, validation data is split into 5k for validation and 5k for testing), cifar100 (same protocol as cifar10), and imagenet16-120 (a subset of 120 classes in ImageNet, downscaled to 16x16, using training data for training, 6k images from validation set for validation and the other 6k for testing).

Type

str

class nni.nas.benchmarks.nasbench201.Nb201TrialStats(*args, **kwargs)[source]

Computation statistics for NAS-Bench-201. Each corresponds to one trial.

config

Setup for this trial data.

Type

Nb201TrialConfig

seed

Random seed selected, for reproduction.

Type

int

train_acc

Final accuracy on training data, ranging from 0 to 100.

Type

float

valid_acc

Final accuracy on validation data, ranging from 0 to 100.

Type

float

test_acc

Final accuracy on test data, ranging from 0 to 100.

Type

float

ori_test_acc

Test accuracy on original validation set (10k for CIFAR and 12k for Imagenet16-120), ranging from 0 to 100.

Type

float

train_loss

Final cross entropy loss on training data. Note that the loss could be NaN, in which case this attribute will be None.

Type

float or None

valid_loss

Final cross entropy loss on validation data.

Type

float or None

test_loss

Final cross entropy loss on test data.

Type

float or None

ori_test_loss

Final cross entropy loss on original validation set.

Type

float or None

parameters

Number of trainable parameters in million.

Type

float

latency

Latency in seconds.

Type

float

flops

FLOPs in million.

Type

float

training_time

Duration of training in seconds.

Type

float

valid_evaluation_time

Time elapsed to evaluate on validation set.

Type

float

test_evaluation_time

Time elapsed to evaluate on test set.

Type

float

ori_test_evaluation_time

Time elapsed to evaluate on original test set.

Type

float

nni.nas.benchmarks.nasbench201.query_nb201_trial_stats(arch, num_epochs, dataset, reduction=None, include_intermediates=False)[source]

Query trial stats of NAS-Bench-201 given conditions.

Parameters
  • arch (dict or None) – If a dict, it is in the format described in nni.nas.benchmarks.nasbench201.Nb201TrialConfig, and only matching trial stats will be returned. If None, all architectures in the database will be matched.

  • num_epochs (int or None) – If int, matching results will be returned. Otherwise a wildcard.

  • dataset (str or None) – If specified, can be one of the datasets available in nni.nas.benchmarks.nasbench201.Nb201TrialConfig. Otherwise a wildcard.

  • reduction (str or None) – If ‘none’ or None, all trial stats will be returned directly. If ‘mean’, fields in trial stats will be averaged given the same trial config.

  • include_intermediates (boolean) – If true, intermediate results will be returned.

Returns

A generator of nni.nas.benchmarks.nasbench201.Nb201TrialStats objects, where each of them has been converted into a dict.

Return type

generator of dict
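
The wildcard convention used by the query parameters (None matches anything) can be sketched in plain Python as follows. The function filter_trials and the toy records are hypothetical; they only mimic the matching rule and do not touch the benchmark database.

```python
from typing import Any, Dict, Iterable, Iterator


def filter_trials(trials: Iterable[Dict[str, Any]],
                  **conditions: Any) -> Iterator[Dict[str, Any]]:
    """Yield trials whose fields equal every condition that is not None."""
    # A condition set to None acts as a wildcard and is dropped.
    active = {k: v for k, v in conditions.items() if v is not None}
    for trial in trials:
        if all(trial.get(k) == v for k, v in active.items()):
            yield trial


# Toy records with only the fields needed for the example.
trials = [
    {'num_epochs': 12, 'dataset': 'cifar10', 'test_acc': 85.0},
    {'num_epochs': 200, 'dataset': 'cifar10', 'test_acc': 93.0},
    {'num_epochs': 200, 'dataset': 'cifar100', 'test_acc': 70.0},
]
long_runs = list(filter_trials(trials, num_epochs=200, dataset=None))
cifar10_long = list(filter_trials(trials, num_epochs=200, dataset='cifar10'))
everything = list(filter_trials(trials, num_epochs=None, dataset=None))
```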

NDS

class nni.nas.benchmarks.nds.NdsIntermediateStats(*args, **kwargs)[source]

Intermediate statistics for NDS.

trial

Corresponding trial.

Type

NdsTrialStats

current_epoch

Elapsed epochs.

Type

int

train_loss

Current cross entropy loss on training data. Can be NaN (None).

Type

float or None

train_acc

Current accuracy on training data, ranging from 0 to 100.

Type

float

test_acc

Current accuracy on test data, ranging from 0 to 100.

Type

float

class nni.nas.benchmarks.nds.NdsTrialConfig(*args, **kwargs)[source]

Trial config for NDS.

model_family

Could be nas_cell, residual_bottleneck, residual_basic or vanilla.

Type

str

model_spec

If model_family is nas_cell, it contains num_nodes_normal, num_nodes_reduce, depth, width, aux and drop_prob. If model_family is residual_bottleneck, it contains bot_muls, ds (depths), num_gs (number of groups) and ss (strides). If model_family is residual_basic or vanilla, it contains ds, ss and ws.

Type

dict

cell_spec

If model_family is not nas_cell, it will be an empty dict. Otherwise, it specifies <normal/reduce>_<i>_<op/input>_<x/y>, where i ranges from 0 to num_nodes_<normal/reduce> - 1. If it is an op, the value is chosen from the constants specified previously, such as nni.nas.benchmarks.nds.CONV_1X1. If it is i’s input, the value ranges from 0 to i + 1, as nas_cell uses the previous two nodes as inputs, and node 0 is actually the second node. Refer to the NASNet paper for details. Finally, another two key-value pairs, normal_concat and reduce_concat, specify which nodes are eventually concatenated into the output.

Type

dict
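
The key pattern for cell_spec can be spelled out programmatically. The sketch below enumerates the expected keys for one cell type and the admissible input indices for node i, reading "from 0 to i + 1" as inclusive (an interpretation). Both helper functions are hypothetical and only illustrate the naming scheme.

```python
from typing import List


def cell_spec_keys(cell_type: str, num_nodes: int) -> List[str]:
    """List keys '<cell_type>_<i>_<op/input>_<x/y>' plus '<cell_type>_concat'."""
    keys = [
        f'{cell_type}_{i}_{kind}_{side}'
        for i in range(num_nodes)
        for kind in ('op', 'input')
        for side in ('x', 'y')
    ]
    keys.append(f'{cell_type}_concat')
    return keys


def valid_inputs(i: int) -> List[int]:
    """Admissible input indices for node i, read as 0..i+1 inclusive."""
    return list(range(i + 2))


normal_keys = cell_spec_keys('normal', num_nodes=2)
```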

dataset

Dataset used. Could be cifar10 or imagenet.

Type

str

generator

Can be one of: random, which generates configurations at random while keeping learning rate and weight decay fixed; fix_w_d, which further keeps width and depth fixed and is only applicable for nas_cell; or tune_lr_wd, which further tunes learning rate and weight decay.

Type

str

proposer

The paper that proposed the distribution for random sampling. Available proposers include nasnet, darts, enas, pnas, amoeba, vanilla, resnext-a, resnext-b, resnet, resnet-b (ResNet with bottleneck). See the NDS paper for details.

Type

str

base_lr

Initial learning rate.

Type

float

weight_decay

L2 weight decay applied on weights.

Type

float

num_epochs

Number of epochs scheduled, during which learning rate will decay to 0 following cosine annealing.

Type

int

class nni.nas.benchmarks.nds.NdsTrialStats(*args, **kwargs)[source]

Computation statistics for NDS. Each corresponds to one trial.

config

Corresponding config for trial.

Type

NdsTrialConfig

seed

Random seed selected, for reproduction.

Type

int

final_train_acc

Final accuracy on training data, ranging from 0 to 100.

Type

float

final_train_loss

Final cross entropy loss on training data. Could be NaN (None).

Type

float or None

final_test_acc

Final accuracy on test data, ranging from 0 to 100.

Type

float

best_train_acc

Best accuracy on training data, ranging from 0 to 100.

Type

float

best_train_loss

Best cross entropy loss on training data. Could be NaN (None).

Type

float or None

best_test_acc

Best accuracy on test data, ranging from 0 to 100.

Type

float

parameters

Number of trainable parameters in million.

Type

float

flops

FLOPs in million.

Type

float

iter_time

Seconds elapsed for each iteration.

Type

float

nni.nas.benchmarks.nds.query_nds_trial_stats(model_family, proposer, generator, model_spec, cell_spec, dataset, num_epochs=None, reduction=None, include_intermediates=False)[source]

Query trial stats of NDS given conditions.

Parameters
  • model_family (str or None) – If str, can be one of the model families available in nni.nas.benchmarks.nds.NdsTrialConfig. Otherwise a wildcard.

  • proposer (str or None) – If str, can be one of the proposers available in nni.nas.benchmarks.nds.NdsTrialConfig. Otherwise a wildcard.

  • generator (str or None) – If str, can be one of the generators available in nni.nas.benchmarks.nds.NdsTrialConfig. Otherwise a wildcard.

  • model_spec (dict or None) – If specified, can be one of the model specs available in nni.nas.benchmarks.nds.NdsTrialConfig. Otherwise a wildcard.

  • cell_spec (dict or None) – If specified, can be one of the cell specs available in nni.nas.benchmarks.nds.NdsTrialConfig. Otherwise a wildcard.

  • dataset (str or None) – If str, can be one of the datasets available in nni.nas.benchmarks.nds.NdsTrialConfig. Otherwise a wildcard.

  • num_epochs (int or None) – If int, matching results will be returned. Otherwise a wildcard.

  • reduction (str or None) – If ‘none’ or None, all trial stats will be returned directly. If ‘mean’, fields in trial stats will be averaged given the same trial config.

  • include_intermediates (boolean) – If true, intermediate results will be returned.

Returns

A generator of nni.nas.benchmarks.nds.NdsTrialStats objects, where each of them has been converted into a dict.

Return type

generator of dict

Retrain (Architecture Evaluation)

nni.retiarii.fixed_arch(fixed_arch, verbose=True)[source]

Load architecture from fixed_arch and apply to model. This should be used as a context manager. For example,

with fixed_arch('/path/to/export.json'):
    model = Model(3, 224, 224)

Parameters
  • fixed_arch (str, Path or dict) – Path to the JSON that stores the architecture, or dict that stores the exported architecture.

  • verbose (bool) – Print log messages if set to True.

Returns

Context manager that provides a fixed architecture when the model is created.

Return type

ContextStack

Utilities

nni.retiarii.basic_unit(cls, basic_unit_tag=True)[source]

Wrapping a module as a basic unit makes it a primitive and stops the engine from digging deeper into it.

basic_unit_tag is true by default. If set to false, the module will not be explicitly marked as a basic unit, and the graph parser will continue to parse it. Currently, this is to handle a special case in nn.Sequential.

Although basic_unit calls trace in its implementation, it is not for serialization. Rather, it is meant to capture the initialization arguments for mutation. Also, the graph execution engine will stop digging into the inner modules when it reaches a module that is decorated with basic_unit.

@basic_unit
class PrimitiveOp(nn.Module):
    ...

nni.retiarii.model_wrapper(cls)[source]

Wrap the base model (search space). For example,

@model_wrapper
class MyModel(nn.Module):
    ...

The wrapper serves two purposes:

  1. Capture the init parameters of python class so that it can be re-instantiated in another process.

  2. Reset uid in namespace so that the auto label counting in each model stably starts from zero.

Currently, NNI might not complain in simple cases where @model_wrapper is actually not needed. In the future, however, @model_wrapper might become mandatory for the base model.
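
The first purpose, capturing init parameters so that an object can be re-created elsewhere, can be illustrated with a small decorator in plain Python. This is only a toy stand-in under simplifying assumptions: capture_init_args, reinstantiate and ToyModel are hypothetical names, and the sketch does not reflect how model_wrapper is actually implemented.

```python
import functools


def capture_init_args(cls):
    """Record constructor arguments on each instance (toy stand-in, not NNI code)."""
    original_init = cls.__init__

    @functools.wraps(original_init)
    def wrapped_init(self, *args, **kwargs):
        # Remember the arguments before running the real constructor.
        self._init_args = args
        self._init_kwargs = dict(kwargs)
        original_init(self, *args, **kwargs)

    cls.__init__ = wrapped_init
    return cls


def reinstantiate(obj):
    """Build a fresh object of the same class from the recorded arguments."""
    return type(obj)(*obj._init_args, **obj._init_kwargs)


@capture_init_args
class ToyModel:
    def __init__(self, channels, depth=2):
        self.channels = channels
        self.depth = depth


model = ToyModel(16, depth=4)
clone = reinstantiate(model)  # a new, equivalent instance
```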

class nni.retiarii.nn.pytorch.mutation_utils.ModelNamespace(key='model')[source]

To create an individual namespace for models:

  1. to enable automatic numbering;

  2. to trace general information (like creation of hyper-parameters) of model.

A namespace is bound to a key. Namespaces bound to different keys are completely isolated. A namespace can have sub-namespaces (with the same key). The numbering will be chained (e.g., model_1_4_2).
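
How chained numbering can produce labels such as model_1_4_2 is sketched below with a toy counter class. ToyNamespace and its methods are hypothetical and only mimic the labeling idea; they are not the ModelNamespace implementation.

```python
class ToyNamespace:
    """Toy label generator with chained numbering (illustrative only)."""

    def __init__(self, prefix: str):
        self.prefix = prefix
        self.counter = 0

    def next_label(self) -> str:
        """Return the next label under this namespace, e.g. 'model_1'."""
        self.counter += 1
        return f'{self.prefix}_{self.counter}'

    def sub_namespace(self) -> 'ToyNamespace':
        """Open a child namespace whose labels extend the parent's next label."""
        return ToyNamespace(self.next_label())


root = ToyNamespace('model')
first = root.next_label()        # label directly under the root
child = root.sub_namespace()     # consumes the root's second number
nested = child.next_label()      # label chained under the child
```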

static current_context(key='model')[source]

Get the current context in key.

static next_label(key='model')[source]

Get the next label for API calls, with automatic numbering.

class nni.retiarii.nn.pytorch.mutation_utils.Mutable[source]

This is just an implementation trick for now.

In the future, this could be the base class for all PyTorch mutables including layer choice, input choice, etc. This is not considered as an interface, but rather as a base class consisting of commonly used class/instance methods. API developers are also advised not to use isinstance(module, Mutable) to check for mutable modules until the design is finalized.

classmethod create_fixed_module(*args, **kwargs)[source]

Try to create a fixed module from the fixed dict. If the code is running in a trial, this method will succeed, and a concrete module will be created instead of a mutable. Raises NoContextError if the creation fails.

exception nni.retiarii.nn.pytorch.mutation_utils.NoContextError[source]

Exception raised when context is missing.

class nni.retiarii.utils.ContextStack(key, value)[source]

This maintains a globally accessible context environment that is visible everywhere.

Use with ContextStack(namespace, value): to initiate, and use get_current_context(namespace) to get the corresponding value in the namespace.

Note that this is not multiprocessing-safe. Also, the values will be cleared in a new process.
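
The pattern described here, a keyed stack that is pushed on entering a with block and popped on leaving it, can be sketched in plain Python as follows. ToyContextStack and its top method are hypothetical names for illustration; the sketch is not the ContextStack source, and like the real class it keeps its state in the current process only.

```python
from typing import Any, Dict, List


class ToyContextStack:
    """Minimal process-local context stack keyed by namespace (illustrative only)."""

    # Shared by all instances: one stack of values per key.
    _stacks: Dict[str, List[Any]] = {}

    def __init__(self, key: str, value: Any):
        self.key = key
        self.value = value

    def __enter__(self) -> 'ToyContextStack':
        self._stacks.setdefault(self.key, []).append(self.value)
        return self

    def __exit__(self, *exc_info: Any) -> None:
        self._stacks[self.key].pop()

    @classmethod
    def top(cls, key: str) -> Any:
        """Return the innermost value for key, or raise if none is active."""
        stack = cls._stacks.get(key)
        if not stack:
            raise LookupError(f'No context under key {key!r}')
        return stack[-1]


with ToyContextStack('fixed', {'layer': 'conv'}):
    with ToyContextStack('fixed', {'layer': 'pool'}):
        inner = ToyContextStack.top('fixed')   # innermost value wins
    outer = ToyContextStack.top('fixed')       # restored after the inner block
```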

class nni.retiarii.utils.ModelNamespace(key='model')[source]

To create an individual namespace for models:

  1. to enable automatic numbering;

  2. to trace general information (like creation of hyper-parameters) of model.

A namespace is bound to a key. Namespaces bound to different keys are completely isolated. A namespace can have sub-namespaces (with the same key). The numbering will be chained (e.g., model_1_4_2).

static current_context(key='model')[source]

Get the current context in key.

static next_label(key='model')[source]

Get the next label for API calls, with automatic numbering.

exception nni.retiarii.utils.NoContextError[source]

Exception raised when context is missing.

nni.retiarii.utils.original_state_dict_hooks(model)[source]

Use this patch if you want to save/load state dict in the original state dict hierarchy.

For example, when you already have a state dict for the base model / search space (which often happens when you have trained a supernet with one-shot strategies), the state dict isn’t organized in the same way as when a sub-model is sampled from the search space. This patch will help the modules in the sub-model find the corresponding module in the base model.

The code looks like:

with original_state_dict_hooks(model):
    model.load_state_dict(state_dict_from_supernet, strict=False)  # supernet has extra keys

Or vice versa:

with original_state_dict_hooks(model):
    supernet_style_state_dict = model.state_dict()