NAS Reference

Mutables

class nni.nas.pytorch.mutables.Mutable(key=None)[source]

Mutable is designed to function as a normal layer, with all necessary operators' weights. States and weights of architectures should be included in the mutator, not in the layer itself.

A mutable has a key, which marks its identity. Users can use this key to share decisions among different mutables. In a mutator's implementation, the key should be used to distinguish different mutables. Mutables that share the same key should be "similar" to each other.

Currently the default scope for keys is global. By default, keys use a global counter starting from 1 to produce unique ids.

Parameters:key (str) – The key of mutable.

Notes

The counter is program level, but mutables are model level. In case multiple models are defined and you want the counter to start from 1 in the second model, it's recommended to assign keys manually instead of using automatic keys.
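As an illustration, here is a hedged sketch of assigning keys manually so that two layer choices share one decision (the candidate modules are illustrative):

import torch.nn as nn
from nni.nas.pytorch.mutables import LayerChoice

# Both layer choices carry the key "conv_op"; a mutator that respects keys
# will make a single decision and apply it to both mutables.
choice_a = LayerChoice([nn.Conv2d(16, 16, 3, padding=1),
                        nn.Conv2d(16, 16, 5, padding=2)], key="conv_op")
choice_b = LayerChoice([nn.Conv2d(16, 16, 3, padding=1),
                        nn.Conv2d(16, 16, 5, padding=2)], key="conv_op")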

key

Read-only property of key.

name

After the search space is parsed, it will be the module name of the mutable.

class nni.nas.pytorch.mutables.LayerChoice(op_candidates, reduction='sum', return_mask=False, key=None)[source]

Layer choice selects one of the op_candidates, then applies it to the inputs and returns the result. In rare cases, it can also select zero or many.

Layer choice does not allow itself to be nested.

Parameters:
  • op_candidates (list of nn.Module or OrderedDict) – A module list to be selected from.
  • reduction (str) – mean, concat, sum or none. Policy when multiple candidates are selected. If none, a list is returned. mean returns the average. sum returns the sum. concat concatenates the list at dimension 1.
  • return_mask (bool) – If return_mask, return output tensor and a mask. Otherwise return tensor only.
  • key (str) – Key of the layer choice.
length

Deprecated. Number of ops to choose from. len(layer_choice) is recommended.

Type:int
names

Names of candidates.

Type:list of str
choices

Deprecated. A list of all candidate modules in the layer choice module. list(layer_choice) is recommended, which will serve the same purpose.

Type:list of Module

Notes

op_candidates can be a list of modules or an ordered dict of named modules, for example,

self.op_choice = LayerChoice(OrderedDict([
    ("conv3x3", nn.Conv2d(3, 16, 3, padding=1)),  # 3x3 kernel
    ("conv5x5", nn.Conv2d(3, 16, 5, padding=2)),  # 5x5 kernel
    ("conv7x7", nn.Conv2d(3, 16, 7, padding=3))   # 7x7 kernel
]))

Elements in a layer choice can be modified or deleted, e.g., del self.op_choice["conv5x5"] or self.op_choice[1] = nn.Conv3d(...). Adding more choices is not supported yet.
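For instance, a short sketch continuing the OrderedDict example above (the replacement module is illustrative):

del self.op_choice["conv5x5"]                       # delete a candidate by name
self.op_choice[1] = nn.Conv2d(3, 16, 7, padding=3)  # replace a candidate by index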

forward(*args, **kwargs)[source]
Returns:Output and selection mask. If return_mask is False, only output is returned.
Return type:tuple of tensors
class nni.nas.pytorch.mutables.InputChoice(n_candidates=None, choose_from=None, n_chosen=None, reduction='sum', return_mask=False, key=None)[source]

Input choice selects n_chosen inputs from choose_from (which contains n_candidates keys). For beginners, using n_candidates instead of choose_from is a safe option. To get the most power out of it, you might want to know about choose_from.

The keys in choose_from can be keys that appear in past mutables, or NO_KEY if there are no suitable ones. The keys are designed to be the keys of the sources. To help mutators make better decisions, mutators might be interested in how the tensors to choose from come into place. For example, the tensor is the output of some operator, some node, some cell, or some module. If this operator happens to be a mutable (e.g., LayerChoice or InputChoice), it has a key naturally that can be used as a source key. If it’s a module/submodule, it needs to be annotated with a key: that’s where a MutableScope is needed.

In the example below, input_choice is a 4-choose-any. The first three inputs are semantically the outputs of cell1, cell2, and op, respectively. Note that an extra max pooling follows cell1, so x1 is not "actually" the direct output of cell1.

import torch
import torch.nn as nn
import torch.nn.functional as F
from nni.nas.pytorch.mutables import InputChoice, LayerChoice, MutableScope

class Cell(MutableScope):
    pass  # cell body omitted for brevity

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.cell1 = Cell("cell1")
        self.cell2 = Cell("cell2")
        # conv3x3 / conv5x5 are assumed helpers that construct nn.Module instances
        self.op = LayerChoice([conv3x3(), conv5x5()], key="op")
        self.input_choice = InputChoice(choose_from=["cell1", "cell2", "op", InputChoice.NO_KEY])

    def forward(self, x):
        x1 = F.max_pool2d(self.cell1(x), 2)  # x1 is not the direct output of cell1
        x2 = self.cell2(x)
        x3 = self.op(x)
        x4 = torch.zeros_like(x)
        return self.input_choice([x1, x2, x3, x4])
Parameters:
  • n_candidates (int) – Number of inputs to choose from.
  • choose_from (list of str) – List of source keys to choose from. At least one of choose_from and n_candidates must be provided. If n_candidates has a value but choose_from is None, it is automatically treated as n_candidates empty strings.
  • n_chosen (int) – Recommended number of inputs to choose. If None, the mutator is instructed to select any number.
  • reduction (str) – mean, concat, sum or none. See LayerChoice.
  • return_mask (bool) – If return_mask, return output tensor and a mask. Otherwise return tensor only.
  • key (str) – Key of the input choice.
forward(optional_inputs)[source]

Forward method of InputChoice.

Parameters:optional_inputs (list or dict) – Recommended to be a dict. As a dict, inputs will be converted to a list following the order of choose_from in initialization. As a list, inputs must follow the same semantic order as choose_from.
Returns:Output and selection mask. If return_mask is False, only output is returned.
Return type:tuple of tensors
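For illustration, a hedged sketch of the dict form, reusing the names from the Net example above:

# Keys must match the entries of choose_from given at initialization;
# the dict is converted internally to a list in choose_from order.
return self.input_choice({
    "cell1": x1,
    "cell2": x2,
    "op": x3,
    InputChoice.NO_KEY: x4,
})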
class nni.nas.pytorch.mutables.MutableScope(key)[source]

Mutable scope marks a subgraph/submodule to help mutators make better decisions.

If not annotated with a mutable scope, the search space will be flattened into a list. However, some mutators might need to leverage the concept of a "cell". If a module is defined as a mutable scope, everything in it will look like a "sub-search-space" within the scope. Scopes can be nested.

There are two ways mutators can use mutable scopes. One is to traverse the search space as a tree during initialization and reset. The other is to implement enter_mutable_scope and exit_mutable_scope, which are called before and after the forward method of the class inheriting MutableScope.

Mutable scopes are also mutables that are listed in the mutator.mutables (search space), but they are not supposed to appear in the dict of choices.

Parameters:key (str) – Key of mutable scope.

Utilities

nni.nas.pytorch.utils.global_mutable_counting()[source]

A program-level counter starting from 1.

Mutators

class nni.nas.pytorch.base_mutator.BaseMutator(model)[source]

A mutator is responsible for mutating a graph by obtaining the search space from the network and implementing callbacks that are called from the forward of mutables.

Parameters:model (nn.Module) – PyTorch model to apply mutator on.
enter_mutable_scope(mutable_scope)[source]

Callback when forward of a MutableScope is entered.

Parameters:mutable_scope (MutableScope) – The mutable scope that is entered.
exit_mutable_scope(mutable_scope)[source]

Callback when forward of a MutableScope is exited.

Parameters:mutable_scope (MutableScope) – The mutable scope that is exited.
export()[source]

Export the data of all decisions. This should output the decisions of all the mutables, so that the whole network can be fully determined with these decisions for further training from scratch.

Returns:Mappings from mutable keys to decisions.
Return type:dict
forward(*inputs)[source]

Warning

Don’t call forward of a mutator.

mutables

A generator of all modules inheriting Mutable. Modules are yielded in the order they are defined in __init__. For mutables whose key appears multiple times, only the first occurrence is yielded.

on_forward_input_choice(mutable, tensor_list)[source]

Callbacks of forward in InputChoice.

Parameters:
  • mutable (InputChoice) – Mutable that is called.
  • tensor_list (list of torch.Tensor) – The arguments mutable is called with.
Returns:

Output tensor and mask.

Return type:

tuple of torch.Tensor and torch.Tensor

on_forward_layer_choice(mutable, *args, **kwargs)[source]

Callbacks of forward in LayerChoice.

Parameters:
  • mutable (LayerChoice) – Module whose forward is called.
  • args (list of torch.Tensor) – The arguments of its forward function.
  • kwargs (dict) – The keyword arguments of its forward function.
Returns:

Output tensor and mask.

Return type:

tuple of torch.Tensor and torch.Tensor

class nni.nas.pytorch.mutator.Mutator(model)[source]
export()[source]

Resample (for the final architecture) and return the results.

Returns:A mapping from key of mutables to decisions.
Return type:dict
graph(inputs)[source]

Return model supernet graph.

Parameters:inputs (tuple of tensor) – Inputs that will be fed into the network.
Returns:A dict containing the graph in TensorBoard GraphDef format. An additional key mutable maps mutable keys to lists of modules.
Return type:dict
on_forward_input_choice(mutable, tensor_list)[source]

By default, this method retrieves the decision obtained previously and selects certain tensors. It then reduces the list of selected tensors with the policy specified in mutable.reduction.

Parameters:
  • mutable (InputChoice) – Input choice module.
  • tensor_list (list of torch.Tensor) – Tensor list to apply the decision on.
Returns:

Output and mask.

Return type:

tuple of torch.Tensor and torch.Tensor

on_forward_layer_choice(mutable, *args, **kwargs)[source]

By default, this method retrieves the decision obtained previously and selects certain operations. Only operations with non-zero weight are executed, and their results are collected into a list. It then reduces the list of outputs with the policy specified in mutable.reduction.

Parameters:
  • mutable (LayerChoice) – Layer choice module.
  • args (list of torch.Tensor) – Positional inputs of the layer choice's forward.
  • kwargs (dict) – Keyword inputs of the layer choice's forward.
Returns:

Output and mask.

Return type:

tuple of torch.Tensor and torch.Tensor

reset()[source]

Reset the mutator by calling sample_search() to resample (for search). The result is stored in a local variable so that on_forward_layer_choice and on_forward_input_choice can use the decision directly.

sample_final()[source]

Override this method to iterate over mutables and make final decisions for export and retraining.

Returns:A mapping from key of mutables to decisions.
Return type:dict

sample_search()[source]

Override this method to iterate over mutables and make decisions.

Returns:A mapping from key of mutables to decisions.
Return type:dict
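As an illustration, a hedged sketch of a custom mutator that always picks the first candidate. It iterates self.mutables as documented above; the boolean-mask decision format and attribute names such as n_candidates and n_chosen are assumptions, not guaranteed by the API:

import torch
from nni.nas.pytorch.mutator import Mutator
from nni.nas.pytorch.mutables import LayerChoice, InputChoice

class FirstChoiceMutator(Mutator):
    def sample_search(self):
        result = {}
        for mutable in self.mutables:
            if isinstance(mutable, LayerChoice):
                mask = torch.zeros(len(mutable), dtype=torch.bool)
                mask[0] = True                        # always pick the first op
                result[mutable.key] = mask
            elif isinstance(mutable, InputChoice):
                mask = torch.zeros(mutable.n_candidates, dtype=torch.bool)
                mask[:mutable.n_chosen or 1] = True   # pick the first input(s)
                result[mutable.key] = mask
        return result

    def sample_final(self):
        return self.sample_search()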
status()[source]

Return current selection status of mutator.

Returns:A mapping from keys of mutables to decisions. All weights (boolean and float) are converted into real numbers. Numpy arrays and tensors are converted into lists.
Return type:dict

Random Mutator

class nni.nas.pytorch.random.RandomMutator(model)[source]

A random mutator that samples a random candidate in the search space each time reset() is called. It uses PyTorch's random functions, so users can set a seed in PyTorch to ensure deterministic behavior.

sample_final()[source]

Same as sample_search().

sample_search()[source]

Sample a random candidate.
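A minimal usage sketch, assuming net is a model containing mutables and x is a suitable input batch:

import torch
from nni.nas.pytorch.random import RandomMutator

torch.manual_seed(0)           # RandomMutator uses PyTorch's RNG
mutator = RandomMutator(net)   # net contains LayerChoice / InputChoice modules
mutator.reset()                # sample a fresh random architecture
output = net(x)                # the forward pass uses the sampled decisions
arch = mutator.export()        # decisions keyed by mutable keys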

Utilities

class nni.nas.pytorch.utils.StructuredMutableTreeNode(mutable)[source]

A structured representation of a search space. A search space comes with a root (with None stored in its mutable) and a number of children stored in children. This tree can be seen as a "flattened" version of the module tree. Since nested mutables are not supported yet, the following must be true: each subtree corresponds to a MutableScope and each leaf corresponds to a Mutable (other than MutableScope).

Parameters:mutable (nni.nas.pytorch.mutables.Mutable) – The mutable that current node is linked with.
add_child(mutable)[source]

Add a tree node to the children list of current node.

traverse(order='pre', deduplicate=True, memo=None)[source]

Return a generator that generates a list of mutables in this tree.

Parameters:
  • order (str) – pre or post. If pre, the current mutable is yielded before its children; otherwise after.
  • deduplicate (bool) – If true, mutables with the same key will not appear again after their first appearance.
  • memo (dict) – An auxiliary dict that memorizes keys seen before, so that deduplication is possible.
Returns:

A generator of mutables in this tree.

Return type:

generator of Mutable

type()[source]

Return the type of mutable content.

Trainers

Trainer

class nni.nas.pytorch.base_trainer.BaseTrainer[source]
checkpoint()[source]

Override to dump a checkpoint.

export(file)[source]

Override the method to export to file.

Parameters:file (str) – File path to export to.
train()[source]

Override the method to train.

validate()[source]

Override the method to validate.

class nni.nas.pytorch.trainer.Trainer(model, mutator, loss, metrics, optimizer, num_epochs, dataset_train, dataset_valid, batch_size, workers, device, log_frequency, callbacks)[source]

A trainer with some helper functions implemented. To implement a new trainer, users need to implement train_one_epoch(), validate_one_epoch() and checkpoint().

Parameters:
  • model (nn.Module) – Model with mutables.
  • mutator (BaseMutator) – A mutator object that has been initialized with the model.
  • loss (callable) – Called with logits and targets. Returns a loss tensor. See PyTorch loss functions for examples.
  • metrics (callable) –

    Called with logits and targets. Returns a dict that maps metrics keys to metrics data. For example,

    def metrics_fn(output, target):
        return {"acc1": accuracy(output, target, topk=1), "acc5": accuracy(output, target, topk=5)}
    
  • optimizer (Optimizer) – Optimizer that optimizes the model.
  • num_epochs (int) – Number of epochs of training.
  • dataset_train (torch.utils.data.Dataset) – Dataset of training. If not otherwise specified, dataset_train and dataset_valid should be standard PyTorch Dataset. See torch.utils.data for examples.
  • dataset_valid (torch.utils.data.Dataset) – Dataset of validation/testing.
  • batch_size (int) – Batch size.
  • workers (int) – Number of workers used in data preprocessing.
  • device (torch.device) – Device object, either torch.device("cuda") or torch.device("cpu"). If None, the trainer automatically detects GPUs and prefers GPU when available.
  • log_frequency (int) – Number of mini-batches between metrics logging.
  • callbacks (list of Callback) – Callbacks to plug into the trainer. See Callbacks.
checkpoint()[source]

Return trainer checkpoint.

enable_visualization()[source]

Enable visualization. Write graph and training log to folder logs/<timestamp>.

export(file)[source]

Call mutator.export() and dump the architecture to file.

Parameters:file (str) – A file path. Expected to be a JSON.
train(validate=True)[source]

Train for num_epochs epochs. Trigger callbacks at the start and the end of each epoch.

Parameters:validate (bool) – If true, will do validation every epoch.
train_one_epoch(epoch)[source]

Train one epoch.

Parameters:epoch (int) – Epoch number starting from 0.
validate()[source]

Do one validation.

validate_one_epoch(epoch)[source]

Validate one epoch.

Parameters:epoch (int) – Epoch number starting from 0.
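For orientation, a hedged sketch of train_one_epoch in a Trainer subclass. It assumes the base class stores its constructor arguments as attributes of the same names (model, mutator, loss, metrics, optimizer, dataset_train, batch_size, workers, device, log_frequency), which may differ from the actual implementation:

from torch.utils.data import DataLoader
from nni.nas.pytorch.trainer import Trainer
from nni.nas.pytorch.utils import AverageMeterGroup

class MyTrainer(Trainer):
    def train_one_epoch(self, epoch):
        loader = DataLoader(self.dataset_train, batch_size=self.batch_size,
                            shuffle=True, num_workers=self.workers)
        meters = AverageMeterGroup()
        self.model.train()
        for step, (x, y) in enumerate(loader):
            x, y = x.to(self.device), y.to(self.device)
            self.mutator.reset()              # sample a new architecture
            self.optimizer.zero_grad()
            logits = self.model(x)
            loss = self.loss(logits, y)
            loss.backward()
            self.optimizer.step()
            metrics = self.metrics(logits, y)
            metrics["loss"] = loss.item()
            meters.update(metrics)
            if self.log_frequency and step % self.log_frequency == 0:
                print("Epoch [%d] Step [%d] %s" % (epoch, step, meters.summary()))

validate_one_epoch would follow the same pattern over dataset_valid, without the backward pass.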

Retrain

nni.nas.pytorch.fixed.apply_fixed_architecture(model, fixed_arc)[source]

Load architecture from fixed_arc and apply to model.

Parameters:
  • model (torch.nn.Module) – Model with mutables.
  • fixed_arc (str or dict) – Path to the JSON that stores the architecture, or dict that stores the exported architecture.
Returns:

Mutator that is responsible for fixing the graph.

Return type:

FixedArchitecture
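A typical retraining flow might look like this sketch (the path is illustrative; Net is the same search-space model definition used during search):

from nni.nas.pytorch.fixed import apply_fixed_architecture

model = Net()  # same model definition that was searched
apply_fixed_architecture(model, "checkpoints/epoch_9.json")
# The model now behaves like an ordinary fixed network and can be
# trained from scratch with a standard training loop.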

class nni.nas.pytorch.fixed.FixedArchitecture(model, fixed_arc, strict=True)[source]

Fixed architecture mutator that always selects a certain graph.

Parameters:
  • model (nn.Module) – A mutable network.
  • fixed_arc (dict) – Preloaded architecture object.
  • strict (bool) – Force everything that appears in fixed_arc to be used at least once.
replace_layer_choice(module=None, prefix='')[source]

Replace layer choices with the selected candidates. This is done on a best-effort basis: in case of weighted or multiple choices, candidates whose weight is zero are deleted; if only a single candidate is chosen, the layer choice is replaced with that module directly.

Parameters:
  • module (nn.Module) – Module to be processed.
  • prefix (str) – Module name under global namespace.
sample_final()[source]

Always returns the fixed architecture.

sample_search()[source]

Always returns the fixed architecture.

Distributed NAS

nni.nas.pytorch.classic_nas.get_and_apply_next_architecture(model)[source]

Wrapper of ClassicMutator to make it more meaningful, similar to get_next_parameter for HPO.

It will generate a search space based on model. If the environment variable NNI_GEN_SEARCH_SPACE exists, it runs in dry-run mode to generate the search space for the experiment. Otherwise, there are two modes: NNI experiment mode, where users use nnictl to start an experiment, and standalone mode, where users directly run the trial command. In standalone mode, this function chooses the first candidate(s) for each LayerChoice and InputChoice.

Parameters:model (nn.Module) – User’s model with search space (e.g., LayerChoice, InputChoice) embedded in it.
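In trial code it is used much like get_next_parameter in HPO; a hedged sketch (train and evaluate are placeholder functions):

import nni
from nni.nas.pytorch.classic_nas import get_and_apply_next_architecture

model = Net()                           # model with LayerChoice / InputChoice
get_and_apply_next_architecture(model)  # apply the architecture chosen by the tuner
train(model)                            # ordinary training of the fixed model
nni.report_final_result(evaluate(model))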
class nni.nas.pytorch.classic_nas.mutator.ClassicMutator(model)[source]

This mutator applies the architecture chosen by the tuner. It implements the forward functions of LayerChoice and InputChoice so that only the chosen ones are activated.

Parameters:model (nn.Module) – User’s model with search space (e.g., LayerChoice, InputChoice) embedded in it.
sample_final()[source]

Convert the chosen architecture and apply it to the model.

sample_search()[source]

See sample_final().

Callbacks

class nni.nas.pytorch.callbacks.Callback[source]

Callback provides an easy way to react to events like begin/end of epochs.

build(model, mutator, trainer)[source]

A callback needs to be built with the model, mutator, and trainer in order to get updates from them.

Parameters:
  • model (nn.Module) – Model to be trained.
  • mutator (BaseMutator) – Mutator that mutates the model.
  • trainer (BaseTrainer) – Trainer that is to call the callback.
on_epoch_begin(epoch)[source]

Implement this to do something at the beginning of each epoch.

Parameters:epoch (int) – Epoch number, starting from 0.
on_epoch_end(epoch)[source]

Implement this to do something at the end of each epoch.

Parameters:epoch (int) – Epoch number, starting from 0.
class nni.nas.pytorch.callbacks.LRSchedulerCallback(scheduler, mode='epoch')[source]

Calls the scheduler at the end of every epoch.

Parameters:scheduler (LRScheduler) – Scheduler to be called.
on_epoch_end(epoch)[source]

Call self.scheduler.step() on epoch end.

class nni.nas.pytorch.callbacks.ArchitectureCheckpoint(checkpoint_dir)[source]

Calls trainer.export() at the end of every epoch.

Parameters:checkpoint_dir (str) – Location to save checkpoints.
on_epoch_end(epoch)[source]

Dump the architecture to <checkpoint_dir>/epoch_{number}.json on epoch end.

class nni.nas.pytorch.callbacks.ModelCheckpoint(checkpoint_dir)[source]

Saves the model checkpoint at the end of every epoch.

Parameters:checkpoint_dir (str) – Location to save checkpoints.
on_epoch_end(epoch)[source]

Dump the model state to <checkpoint_dir>/epoch_{number}.pth.tar on every epoch end. DataParallel objects will have their inner modules exported.
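Callbacks are passed to the trainer as a list, for example (a sketch; MyTrainer refers to the trainer subclass sketched earlier, and the model, mutator, optimizer, and datasets are assumed to exist):

from torch.optim.lr_scheduler import CosineAnnealingLR
from nni.nas.pytorch.callbacks import (ArchitectureCheckpoint,
                                       LRSchedulerCallback, ModelCheckpoint)

scheduler = CosineAnnealingLR(optimizer, T_max=num_epochs)
callbacks = [
    LRSchedulerCallback(scheduler),           # step the LR schedule every epoch
    ArchitectureCheckpoint("./checkpoints"),  # dump epoch_{n}.json every epoch
    ModelCheckpoint("./checkpoints"),         # dump epoch_{n}.pth.tar every epoch
]
trainer = MyTrainer(model, mutator, loss, metrics, optimizer, num_epochs,
                    dataset_train, dataset_valid, batch_size=64, workers=4,
                    device=None, log_frequency=10, callbacks=callbacks)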

Utilities

class nni.nas.pytorch.utils.AverageMeterGroup[source]

Average meter group for multiple average meters.

summary()[source]

Return a summary string of group data.

update(data)[source]

Update the meter group with a dict of metrics. Meters that do not exist yet will be created automatically.
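A small usage sketch:

from nni.nas.pytorch.utils import AverageMeterGroup

meters = AverageMeterGroup()
meters.update({"loss": 0.9, "acc1": 0.52})  # creates meters "loss" and "acc1"
meters.update({"loss": 0.7, "acc1": 0.58})  # running averages are accumulated
print(meters.summary())                     # one-line summary of all meters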

class nni.nas.pytorch.utils.AverageMeter(name, fmt=':f')[source]

Computes and stores the average and current value.

Parameters:
  • name (str) – Name to display.
  • fmt (str) – Format string to print the values.
reset()[source]

Reset the meter.

update(val, n=1)[source]

Update with value and weight.

Parameters:
  • val (float or int) – The new value to be accounted for.
  • n (int) – The weight of the new value.
nni.nas.pytorch.utils.to_device(obj, device)[source]

Move a tensor, tuple, list, or dict onto the given device.