NAS Reference
Mutables
class nni.nas.pytorch.mutables.Mutable(key=None)
Mutable is designed to function as a normal layer, with all necessary operators’ weights. States and weights of architectures should be included in the mutator, instead of the layer itself.
Mutable has a key, which marks the identity of the mutable. Users can use this key to share decisions among different mutables. In a mutator’s implementation, the mutator should use the key to distinguish different mutables. Mutables that share the same key should be “similar” to each other.
Currently the default scope for keys is global. By default, keys are generated from a global counter that starts at 1, which produces unique ids.
Parameters:
- key (str) – The key of mutable.
Notes
The counter is program level, but mutables are model level. If multiple models are defined and you want the counter to start from 1 in the second model, assign keys manually instead of relying on automatic keys.
key
Read-only property of key.
name
After the search space is parsed, it will be the module name of the mutable.
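The interaction between the program-level counter and manually assigned keys can be illustrated with a small pure-Python sketch. `SketchMutable`, `_global_counter`, and the "class name plus counter" key format are assumptions made for illustration; this is not the library's class.

```python
import itertools

# Program-level counter: shared by every model built in this process.
_global_counter = itertools.count(1)


class SketchMutable:
    """Hypothetical stand-in for a mutable; only the key handling is modeled."""

    def __init__(self, key=None):
        if key is None:
            # Automatic key (assumed format): class name plus next counter value.
            key = "{}{}".format(type(self).__name__, next(_global_counter))
        self._key = key

    @property
    def key(self):
        # Read-only: no setter is defined.
        return self._key


first_model = [SketchMutable(), SketchMutable()]
# The counter does not restart for a second model ...
second_model_auto = SketchMutable()
# ... so a manually assigned key is the predictable option.
second_model_manual = SketchMutable(key="op1")

print([m.key for m in first_model], second_model_auto.key, second_model_manual.key)
```

In this sketch the automatically keyed mutable of the second model continues from 3, which is the situation the note above warns about.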
class nni.nas.pytorch.mutables.LayerChoice(op_candidates, reduction='sum', return_mask=False, key=None)
Layer choice selects one of the op_candidates, applies it to the inputs, and returns the result. In rare cases, it can also select zero or many. Layer choice does not allow itself to be nested.
Parameters:
- op_candidates (list of nn.Module or OrderedDict) – A module list to be selected from.
- reduction (str) – mean, concat, sum or none. Policy applied if multiple candidates are selected. If none, a list is returned. mean returns the average. sum returns the sum. concat concatenates the list at dimension 1.
- return_mask (bool) – If return_mask is true, return the output tensor and a mask. Otherwise return the tensor only.
- key (str) – Key of the layer choice.
length
Deprecated. Number of ops to choose from. len(layer_choice) is recommended.
Type: int
names
Names of candidates.
Type: list of str
choices
Deprecated. A list of all candidate modules in the layer choice module. list(layer_choice) is recommended, which serves the same purpose.
Type: list of Module
Notes
op_candidates can be a list of modules or an ordered dict of named modules, for example,
    # nn.Conv2d takes (in_channels, out_channels, kernel_size); all candidates
    # accept the same input and differ only in kernel size.
    self.op_choice = LayerChoice(OrderedDict([
        ("conv3x3", nn.Conv2d(16, 128, 3)),
        ("conv5x5", nn.Conv2d(16, 128, 5)),
        ("conv7x7", nn.Conv2d(16, 128, 7))
    ]))
Elements in layer choice can be modified or deleted. Use del self.op_choice["conv5x5"] or self.op_choice[1] = nn.Conv3d(...). Adding more choices is not supported yet.
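The four reduction policies described for LayerChoice can be sketched without tensors. The function below is an illustration only: `reduce_outputs` is a hypothetical name, and each "output" is a plain nested list of shape batch x features standing in for a tensor.

```python
def reduce_outputs(outputs, reduction="sum"):
    """Combine several candidate outputs (2-D nested lists of equal shape)."""
    if reduction == "none":
        # No reduction: hand back the list of outputs unchanged.
        return list(outputs)
    if reduction == "concat":
        # Concatenate at dimension 1: join the rows of all outputs side by side.
        return [sum((out[i] for out in outputs), []) for i in range(len(outputs[0]))]
    if reduction in ("sum", "mean"):
        scale = len(outputs) if reduction == "mean" else 1
        # zip(*outputs) pairs up rows; zip(*rows) pairs up values in a row.
        return [[sum(vals) / scale for vals in zip(*rows)] for rows in zip(*outputs)]
    raise ValueError("unknown reduction: {!r}".format(reduction))


a = [[1.0, 2.0]]
b = [[3.0, 4.0]]
print(reduce_outputs([a, b], "sum"))     # elementwise sum
print(reduce_outputs([a, b], "mean"))    # elementwise average
print(reduce_outputs([a, b], "concat"))  # features joined per row
```

Note that concat changes the feature dimension, so downstream layers see a wider input when several candidates are selected.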
class nni.nas.pytorch.mutables.InputChoice(n_candidates=None, choose_from=None, n_chosen=None, reduction='sum', return_mask=False, key=None)
Input choice selects n_chosen inputs from choose_from (which contains n_candidates keys). For beginners, using n_candidates instead of choose_from is a safe option. To get the most power out of it, you might want to know about choose_from.
The keys in choose_from can be keys that appear in past mutables, or NO_KEY if there are no suitable ones. The keys are designed to be the keys of the sources. To make better decisions, mutators might be interested in how the tensors to choose from come into place. For example, the tensor is the output of some operator, some node, some cell, or some module. If this operator happens to be a mutable (e.g., LayerChoice or InputChoice), it naturally has a key that can be used as a source key. If it’s a module/submodule, it needs to be annotated with a key: that’s where a MutableScope is needed.
In the example below, input_choice is a 4-choose-any. The first three inputs are semantically the outputs of cell1, cell2 and op, respectively. Notice that an extra max pooling follows cell1, indicating that x1 is not “actually” the direct output of cell1.
    import torch
    import torch.nn as nn
    from nni.nas.pytorch.mutables import InputChoice, LayerChoice, MutableScope

    # conv3x3, conv5x5 and max_pooling are placeholders assumed to be defined elsewhere.

    class Cell(MutableScope):
        pass

    class Net(nn.Module):
        def __init__(self):
            super().__init__()  # required before submodules are assigned
            self.cell1 = Cell("cell1")
            self.cell2 = Cell("cell2")
            self.op = LayerChoice([conv3x3(), conv5x5()], key="op")
            self.input_choice = InputChoice(choose_from=["cell1", "cell2", "op", InputChoice.NO_KEY])

        def forward(self, x):
            x1 = max_pooling(self.cell1(x))
            x2 = self.cell2(x)
            x3 = self.op(x)
            x4 = torch.zeros_like(x)  # no source key: matches NO_KEY
            return self.input_choice([x1, x2, x3, x4])
Parameters:
- n_candidates (int) – Number of inputs to choose from.
- choose_from (list of str) – List of source keys to choose from. At least one of choose_from and n_candidates must be specified. If n_candidates has a value but choose_from is None, choose_from is automatically treated as a list of n_candidates empty strings.
- n_chosen (int) – Recommended number of inputs to choose. If None, the mutator is instructed to select any number of inputs.
- reduction (str) – mean, concat, sum or none. See LayerChoice.
- return_mask (bool) – If return_mask is true, return the output tensor and a mask. Otherwise return the tensor only.
- key (str) – Key of the input choice.
forward(optional_inputs)
Forward method of InputChoice.
Parameters:
- optional_inputs (list or dict) – Recommended to be a dict. As a dict, inputs will be converted to a list that follows the order of choose_from in initialization. As a list, inputs must follow the same semantic order as choose_from.
Returns: Output and selection mask. If return_mask is False, only output is returned.
Return type: tuple of tensors
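The handling of dict versus list inputs, and the effect of return_mask, can be sketched in plain Python. `to_ordered_list` and `apply_mask` are hypothetical helper names, numbers stand in for tensors, and the sum reduction is assumed.

```python
def to_ordered_list(optional_inputs, choose_from):
    """Return the inputs as a list that follows the order of choose_from."""
    if isinstance(optional_inputs, dict):
        # Dict input: look each source key up in the order given at init time.
        return [optional_inputs[key] for key in choose_from]
    if len(optional_inputs) != len(choose_from):
        raise ValueError("expected one input per entry in choose_from")
    # List input: the caller is trusted to have used the same order.
    return list(optional_inputs)


def apply_mask(inputs, mask, return_mask=False):
    """Keep the inputs whose mask entry is true and reduce them with 'sum'."""
    chosen = [x for x, keep in zip(inputs, mask) if keep]
    output = sum(chosen)
    return (output, mask) if return_mask else output


choose_from = ["cell1", "cell2", "op"]
ordered = to_ordered_list({"op": 30.0, "cell1": 1.0, "cell2": 2.0}, choose_from)
only_output = apply_mask(ordered, [True, False, True])
output_and_mask = apply_mask(ordered, [True, False, True], return_mask=True)
print(ordered, only_output, output_and_mask)
```

Passing a dict removes the risk of silently permuting inputs, which is presumably why it is the recommended form.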
class nni.nas.pytorch.mutables.MutableScope(key)
Mutable scope marks a subgraph/submodule to help mutators make better decisions.
If not annotated with a mutable scope, the search space is flattened into a list. However, some mutators might need to leverage the concept of a “cell”. So if a module is defined as a mutable scope, everything in it looks like a “sub-search-space” within the scope. Scopes can be nested.
There are two ways mutators can use mutable scopes. One is to traverse the search space as a tree during initialization and reset. The other is to implement enter_mutable_scope and exit_mutable_scope, which are called before and after the forward method of the class inheriting mutable scope.
Mutable scopes are also mutables and are listed in mutator.mutables (the search space), but they are not supposed to appear in the dict of choices.
Parameters:
- key (str) – Key of mutable scope.
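The second usage pattern, callbacks fired before and after a scope's forward, can be sketched as follows. `SketchScope` and `RecordingMutator` are hypothetical stand-ins written for this illustration; only the callback names enter_mutable_scope and exit_mutable_scope come from the reference.

```python
class RecordingMutator:
    """Hypothetical mutator that only records scope events."""

    def __init__(self):
        self.events = []

    def enter_mutable_scope(self, scope):
        self.events.append(("enter", scope.key))

    def exit_mutable_scope(self, scope):
        self.events.append(("exit", scope.key))


class SketchScope:
    """Hypothetical scope: wraps a callable body with enter/exit callbacks."""

    def __init__(self, key, mutator, body):
        self.key = key
        self.mutator = mutator
        self.body = body

    def __call__(self, x):
        self.mutator.enter_mutable_scope(self)
        try:
            return self.body(x)
        finally:
            # The exit callback runs even if the body raises.
            self.mutator.exit_mutable_scope(self)


mutator = RecordingMutator()
inner = SketchScope("inner", mutator, lambda x: x + 1)
outer = SketchScope("outer", mutator, lambda x: inner(x) * 2)  # nested scope
result = outer(3)
print(result, mutator.events)
```

Because scopes can be nested, the events arrive in stack order: the inner scope exits before the outer one does.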
Mutators
class nni.nas.pytorch.base_mutator.BaseMutator(model)
A mutator is responsible for mutating a graph by obtaining the search space from the network and implementing callbacks that are called in forward in mutables.
Parameters:
- model (nn.Module) – PyTorch model to apply mutator on.
enter_mutable_scope(mutable_scope)
Callback when forward of a MutableScope is entered.
Parameters:
- mutable_scope (MutableScope) – The mutable scope that is entered.
exit_mutable_scope(mutable_scope)
Callback when forward of a MutableScope is exited.
Parameters:
- mutable_scope (MutableScope) – The mutable scope that is exited.
export()
Export the data of all decisions. This should output the decisions of all the mutables, so that the whole network can be fully determined with these decisions for further training from scratch.
Returns: Mappings from mutable keys to decisions.
Return type: dict
mutables
A generator of all modules inheriting Mutable. Modules are yielded in the order that they are defined in __init__. For mutables whose keys appear multiple times, only the first one is yielded.
on_forward_input_choice(mutable, tensor_list)
Callback of forward in InputChoice.
Parameters:
- mutable (InputChoice) – Mutable that is called.
- tensor_list (list of torch.Tensor) – The arguments mutable is called with.
Returns: Output tensor and mask.
Return type: tuple of torch.Tensor and torch.Tensor
on_forward_layer_choice(mutable, *args, **kwargs)
Callback of forward in LayerChoice.
Parameters:
- mutable (LayerChoice) – Module whose forward is called.
- args (list of torch.Tensor) – The arguments of its forward function.
- kwargs (dict) – The keyword arguments of its forward function.
Returns: Output tensor and mask.
Return type: tuple of torch.Tensor and torch.Tensor
class nni.nas.pytorch.mutator.Mutator(model)
export()
Resample (for final) and return results.
Returns: A mapping from keys of mutables to decisions.
Return type: dict
graph(inputs)
Return model supernet graph.
Parameters:
- inputs (tuple of tensor) – Inputs that will be fed into the network.
Returns: Dict containing node, in Tensorboard GraphDef format. The additional key mutable is a map from key to list of modules.
Return type: dict
on_forward_input_choice(mutable, tensor_list)
By default, this method retrieves the decision obtained previously and selects certain tensors. Then it reduces the list of all tensor outputs with the policy specified in mutable.reduction.
Parameters:
- mutable (InputChoice) – Input choice module.
- tensor_list (list of torch.Tensor) – Tensor list to apply the decision on.
Returns: Output and mask.
Return type: tuple of torch.Tensor and torch.Tensor
on_forward_layer_choice(mutable, *args, **kwargs)
By default, this method retrieves the decision obtained previously and selects certain operations. Only operations with non-zero weight are executed. The results are added to a list, which is then reduced with the policy specified in mutable.reduction.
Parameters:
- mutable (LayerChoice) – Layer choice module.
- args (list of torch.Tensor) – Inputs
- kwargs (dict) – Inputs
Returns: Output and mask.
Return type: tuple of torch.Tensor and torch.Tensor
reset()
Reset the mutator by calling sample_search to resample (for search). Stores the result in a local variable so that on_forward_layer_choice and on_forward_input_choice can use the decision directly.
sample_final()
Override this method to iterate over mutables and make decisions that are final for export and retraining.
Returns: A mapping from keys of mutables to decisions.
Return type: dict
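The relationship between reset, sample_search, sample_final, export, and the forward callback can be sketched in plain Python. Everything below is a hypothetical model of that lifecycle (`SketchMutator`, `FirstCandidateMutator`, the dict-based search space, and the simplified callback signature are assumptions), with numbers standing in for tensors and sum as the reduction.

```python
class SketchMutator:
    """Hypothetical lifecycle model; search_space maps key -> candidate count."""

    def __init__(self, search_space):
        self.search_space = search_space
        self._cache = {}

    def sample_search(self):
        raise NotImplementedError

    def sample_final(self):
        raise NotImplementedError

    def reset(self):
        # Resample for search and keep the result for the forward callbacks.
        self._cache = self.sample_search()

    def export(self):
        # Resample for final and return the decisions.
        return self.sample_final()

    def on_forward_layer_choice(self, key, ops, x):
        mask = self._cache[key]
        # Only candidates with a non-zero mask entry are executed.
        outputs = [op(x) for op, keep in zip(ops, mask) if keep]
        return sum(outputs), mask


class FirstCandidateMutator(SketchMutator):
    """Always picks the first candidate of every mutable."""

    def sample_search(self):
        return {key: [i == 0 for i in range(n)] for key, n in self.search_space.items()}

    def sample_final(self):
        return self.sample_search()


executed = []

def make_op(name, fn):
    def op(x):
        executed.append(name)  # record which candidates actually ran
        return fn(x)
    return op

ops = [make_op("add1", lambda x: x + 1), make_op("times10", lambda x: x * 10)]
mutator = FirstCandidateMutator({"op": 2})
mutator.reset()
output, mask = mutator.on_forward_layer_choice("op", ops, 5)
exported = mutator.export()
print(output, mask, executed, exported)
```

Caching the decision in reset means every forward pass between two resets sees the same architecture.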
Random Mutator
class nni.nas.pytorch.random.RandomMutator(model)
Random mutator that samples a random candidate in the search space each time reset() is called. It uses the random functions in PyTorch, so users can set the seed in PyTorch to ensure deterministic behavior.
sample_final()
Same as sample_search().
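Seeded random sampling can be sketched as below. This is an illustration under stated assumptions: it uses Python's standard random module rather than PyTorch's random functions, and `SketchRandomMutator` with its dict-based search space is a hypothetical stand-in.

```python
import random


class SketchRandomMutator:
    """Hypothetical random sampler; search_space maps key -> candidate count."""

    def __init__(self, search_space, seed=None):
        self.search_space = search_space
        self._rng = random.Random(seed)  # private generator, seeded for repeatability
        self.decisions = {}

    def sample_search(self):
        decisions = {}
        for key, n_candidates in self.search_space.items():
            chosen = self._rng.randrange(n_candidates)
            # One-hot mask marking the sampled candidate.
            decisions[key] = [i == chosen for i in range(n_candidates)]
        return decisions

    def sample_final(self):
        # Same as sample_search: the final decision is also a random draw.
        return self.sample_search()

    def reset(self):
        self.decisions = self.sample_search()


space = {"op": 3, "skip": 2}
first = SketchRandomMutator(space, seed=0)
second = SketchRandomMutator(space, seed=0)
first.reset()
second.reset()
print(first.decisions == second.decisions)
```

Two samplers created with the same seed draw the same sequence of decisions, which is the property that seeding is meant to provide.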
Utilities
class nni.nas.pytorch.utils.StructuredMutableTreeNode(mutable)
A structured representation of a search space. A search space comes with a root (with None stored in its mutable) and a number of child nodes stored in its children. This tree can be seen as a “flattened” version of the module tree. Since nested mutable entities are not supported yet, the following must be true: each subtree corresponds to a MutableScope and each leaf corresponds to a Mutable (other than MutableScope).
Parameters:
- mutable (nni.nas.pytorch.mutables.Mutable) – The mutable that the current node is linked with.
traverse(order='pre', deduplicate=True, memo=None)
Return a generator that yields the mutables in this tree.
Parameters:
- order (str) – pre or post. If pre, the current mutable is yielded before its children. Otherwise it is yielded after them.
- deduplicate (bool) – If true, mutables with the same key will not appear after the first appearance.
- memo (dict) – An auxiliary dict that records keys seen before, so that deduplication is possible.
Return type: generator of Mutable
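The pre/post ordering and key deduplication can be sketched with a small tree of keys. `SketchNode` is a hypothetical class that yields keys rather than mutables, and it uses a set for memo where the reference describes a dict.

```python
class SketchNode:
    """Hypothetical tree node; the root carries key None."""

    def __init__(self, key=None, children=()):
        self.key = key
        self.children = list(children)

    def traverse(self, order="pre", deduplicate=True, memo=None):
        if order not in ("pre", "post"):
            raise ValueError("order must be 'pre' or 'post'")
        if memo is None:
            memo = set()  # keys already yielded, shared across the recursion
        if order == "pre":
            yield from self._visit(deduplicate, memo)
        for child in self.children:
            yield from child.traverse(order, deduplicate, memo)
        if order == "post":
            yield from self._visit(deduplicate, memo)

    def _visit(self, deduplicate, memo):
        if self.key is None:
            return  # the root holds no mutable
        if deduplicate and self.key in memo:
            return  # a mutable with this key was already yielded
        memo.add(self.key)
        yield self.key


root = SketchNode(children=[
    SketchNode("cell1", [SketchNode("op_a"), SketchNode("op_b")]),
    SketchNode("cell2", [SketchNode("op_a")]),  # shares its key with cell1's op_a
])
pre = list(root.traverse("pre"))
post = list(root.traverse("post"))
pre_all = list(root.traverse("pre", deduplicate=False))
print(pre, post, pre_all)
```

Passing the same memo down the recursion is what lets a key seen in one subtree suppress its repeat in another.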
Trainers
Trainer
class nni.nas.pytorch.base_trainer.BaseTrainer
class nni.nas.pytorch.trainer.Trainer(model, mutator, loss, metrics, optimizer, num_epochs, dataset_train, dataset_valid, batch_size, workers, device, log_frequency, callbacks)
A trainer with some helper functions implemented. To implement a new trainer, users need to implement train_one_epoch(), validate_one_epoch() and checkpoint().
Parameters:
- model (nn.Module) – Model with mutables.
- mutator (BaseMutator) – A mutator object that has been initialized with the model.
- loss (callable) – Called with logits and targets. Returns a loss tensor. See PyTorch loss functions for examples.
- metrics (callable) – Called with logits and targets. Returns a dict that maps metrics keys to metrics data. For example,
    def metrics_fn(output, target):
        # accuracy is a user-supplied helper, not defined in this reference
        return {"acc1": accuracy(output, target, topk=1), "acc5": accuracy(output, target, topk=5)}
- optimizer (Optimizer) – Optimizer that optimizes the model.
- num_epochs (int) – Number of epochs of training.
- dataset_train (torch.utils.data.Dataset) – Dataset of training. If not otherwise specified, dataset_train and dataset_valid should be standard PyTorch Datasets. See torch.utils.data for examples.
- dataset_valid (torch.utils.data.Dataset) – Dataset of validation/testing.
- batch_size (int) – Batch size.
- workers (int) – Number of workers used in data preprocessing.
- device (torch.device) – Device object. Either torch.device("cuda") or torch.device("cpu"). When None, the trainer automatically detects whether a GPU is available and prefers it.
- log_frequency (int) – Number of mini-batches between metric logs.
- callbacks (list of Callback) – Callbacks to plug into the trainer. See Callbacks.
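A metrics callable of the shape described for the metrics parameter can be sketched in plain Python. The `accuracy` helper here is an assumption written for this illustration (the reference does not define it), and nested lists of class scores stand in for logits tensors.

```python
def accuracy(output, target, topk=1):
    """Fraction of samples whose true label is among the topk highest scores.

    output: list of per-class score lists; target: list of class indices.
    """
    correct = 0
    for scores, label in zip(output, target):
        # Class indices sorted by descending score.
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if label in ranked[:topk]:
            correct += 1
    return correct / len(target)


def metrics_fn(output, target):
    # Maps metric names to values, as the trainer expects of `metrics`.
    return {
        "acc1": accuracy(output, target, topk=1),
        "acc2": accuracy(output, target, topk=2),
    }


logits = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
labels = [1, 1, 0]
metrics = metrics_fn(logits, labels)
print(metrics)
```

Returning a dict keeps the trainer agnostic of which metrics are tracked; each key can be logged or averaged independently.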
enable_visualization()
Enable visualization. Write graph and training log to folder logs/<timestamp>.
export(file)
Call mutator.export() and dump the architecture to file.
Parameters:
- file (str) – A file path. Expected to be a JSON.
train(validate=True)
Train for num_epochs epochs. Trigger callbacks at the start and the end of each epoch.
Parameters:
- validate (bool) – If true, will do validation every epoch.
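The epoch loop described for train, with callbacks at the start and end of each epoch and optional validation, can be sketched as follows. `SketchTrainer`, `SketchCallback`, and the hook names on_epoch_begin and on_epoch_end are assumptions made for this illustration; they are not documented in this reference.

```python
class SketchCallback:
    """Hypothetical callback that records when it is triggered."""

    def __init__(self, log):
        self.log = log

    def on_epoch_begin(self, epoch):
        self.log.append(("begin", epoch))

    def on_epoch_end(self, epoch):
        self.log.append(("end", epoch))


class SketchTrainer:
    """Hypothetical trainer; the per-epoch work only writes to a log."""

    def __init__(self, num_epochs, callbacks, log):
        self.num_epochs = num_epochs
        self.callbacks = callbacks
        self.log = log

    def train_one_epoch(self, epoch):
        self.log.append(("train", epoch))

    def validate_one_epoch(self, epoch):
        self.log.append(("validate", epoch))

    def train(self, validate=True):
        for epoch in range(self.num_epochs):
            for callback in self.callbacks:
                callback.on_epoch_begin(epoch)
            self.train_one_epoch(epoch)
            if validate:
                self.validate_one_epoch(epoch)
            for callback in self.callbacks:
                callback.on_epoch_end(epoch)


log = []
trainer = SketchTrainer(num_epochs=2, callbacks=[SketchCallback(log)], log=log)
trainer.train(validate=True)
print(log)
```

Running the end-of-epoch hooks after validation lets a callback such as a checkpoint or scheduler step see the epoch's complete results.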
Retrain
nni.nas.pytorch.fixed.apply_fixed_architecture(model, fixed_arc)
Load architecture from fixed_arc and apply it to model.
Parameters:
- model (torch.nn.Module) – Model with mutables.
- fixed_arc (str or dict) – Path to the JSON file that stores the architecture, or a dict that stores the exported architecture.
Returns: Mutator that is responsible for fixing the graph.
Return type: FixedArchitecture
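The round trip from an exported architecture to the fixed_arc argument, given either as a path or as a dict, can be sketched with the standard json module. `export_architecture` and `load_fixed_arc` are hypothetical helper names, and the boolean-mask decision format is an assumption for illustration.

```python
import json
import os
import tempfile


def export_architecture(decisions, file):
    """Dump a mapping from mutable keys to decisions as JSON."""
    with open(file, "w") as f:
        json.dump(decisions, f, indent=2)


def load_fixed_arc(fixed_arc):
    """Accept either a path to a JSON file or an already loaded dict."""
    if isinstance(fixed_arc, str):
        with open(fixed_arc) as f:
            return json.load(f)
    return fixed_arc


decisions = {"op": [True, False], "skip": [False, True, True]}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "arch.json")
    export_architecture(decisions, path)
    from_path = load_fixed_arc(path)   # str input: read from disk
from_dict = load_fixed_arc(decisions)  # dict input: used as-is
print(from_path == from_dict)
```

Accepting both forms lets a retraining script consume a checkpoint file directly or reuse decisions that are still in memory.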
class nni.nas.pytorch.fixed.FixedArchitecture(model, fixed_arc, strict=True)
Fixed architecture mutator that always selects a certain graph.
Parameters:
- model (nn.Module) – A mutable network.
- fixed_arc (dict) – Preloaded architecture object.
- strict (bool) – Force everything that appears in fixed_arc to be used at least once.
replace_layer_choice(module=None, prefix='')
Replace layer choices with selected candidates. This is done on a best-effort basis. In case of weighted choices or multiple choices, if some of the choices are weighted with zero, delete them. If there is a single choice, replace the module with a normal module.
Parameters:
- module (nn.Module) – Module to be processed.
- prefix (str) – Module name under global namespace.
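The pruning rule described for replace_layer_choice (drop zero-weighted candidates, collapse to a plain module when one remains) can be sketched on an ordered dict. `prune_layer_choice` is a hypothetical helper and strings stand in for modules; this is not the library's implementation.

```python
from collections import OrderedDict


def prune_layer_choice(candidates, weights):
    """Drop zero-weighted candidates; unwrap if exactly one candidate is left."""
    kept = OrderedDict(
        (name, op)
        for (name, op), weight in zip(candidates.items(), weights)
        if weight != 0
    )
    if len(kept) == 1:
        # Single choice: the container is replaced by the candidate itself.
        return next(iter(kept.values()))
    # Weighted or multiple choices: keep a smaller container.
    return kept


candidates = OrderedDict([("conv3x3", "A"), ("conv5x5", "B"), ("conv7x7", "C")])
single = prune_layer_choice(candidates, [0, 1, 0])
multiple = prune_layer_choice(candidates, [0.5, 0, 0.5])
print(single, list(multiple))
```

After pruning, the retrained model carries no parameters for candidates that were never selected.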
Distributed NAS
nni.nas.pytorch.classic_nas.get_and_apply_next_architecture(model)
Wrapper of ClassicMutator to make it more meaningful, similar to get_next_parameter for HPO.
It generates the search space based on model. If the environment variable NNI_GEN_SEARCH_SPACE exists, this is dry run mode, which generates the search space for the experiment. If not, there are still two modes. One is NNI experiment mode, where users use nnictl to start an experiment. The other is standalone mode, where users directly run the trial command; this mode chooses the first one(s) for each LayerChoice and InputChoice.
Parameters:
- model (nn.Module) – User’s model with search space (e.g., LayerChoice, InputChoice) embedded in it.
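The three modes can be sketched as a small decision function. Only the NNI_GEN_SEARCH_SPACE variable comes from the text above; `detect_mode`, `standalone_decision`, and the `started_by_nnictl` flag are hypothetical, since the reference does not say how an experiment launch is detected.

```python
def detect_mode(environ, started_by_nnictl):
    """Pick the mode from the environment and a caller-supplied flag."""
    if "NNI_GEN_SEARCH_SPACE" in environ:
        return "dry_run"  # only generate the search space
    # How an nnictl launch is detected is not specified; a flag is assumed here.
    return "experiment" if started_by_nnictl else "standalone"


def standalone_decision(n_candidates, n_chosen=1):
    """Standalone mode: choose the first one(s) among the candidates."""
    return [index < n_chosen for index in range(n_candidates)]


print(detect_mode({"NNI_GEN_SEARCH_SPACE": "space.json"}, False))
print(detect_mode({}, True), detect_mode({}, False))
print(standalone_decision(3), standalone_decision(4, n_chosen=2))
```

Standalone mode gives a deterministic architecture, which is convenient for debugging a trial script before launching an experiment.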
class nni.nas.pytorch.classic_nas.mutator.ClassicMutator(model)
This mutator applies the architecture chosen by the tuner. It implements the forward function of LayerChoice and InputChoice so that only the chosen ones are activated.
Parameters:
- model (nn.Module) – User’s model with search space (e.g., LayerChoice, InputChoice) embedded in it.
sample_search()
See sample_final().
Callbacks
class nni.nas.pytorch.callbacks.Callback
Callback provides an easy way to react to events like begin/end of epochs.
build(model, mutator, trainer)
Callback needs to be built with model, mutator, trainer, to get updates from them.
Parameters:
- model (nn.Module) – Model to be trained.
- mutator (nn.Module) – Mutator that mutates the model.
- trainer (BaseTrainer) – Trainer that is to call the callback.
class nni.nas.pytorch.callbacks.LRSchedulerCallback(scheduler, mode='epoch')
Calls the scheduler at the end of every epoch.
Parameters:
- scheduler (LRScheduler) – Scheduler to be called.
class nni.nas.pytorch.callbacks.ArchitectureCheckpoint(checkpoint_dir)
Calls trainer.export() at the end of every epoch.
Parameters:
- checkpoint_dir (str) – Location to save checkpoints.
Utilities
class nni.nas.pytorch.utils.AverageMeterGroup
Average meter group for multiple average meters.
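The idea of a group of running averages keyed by metric name can be sketched as below. `SketchAverageMeter`, `SketchMeterGroup`, and their methods are hypothetical; the reference documents no methods for AverageMeterGroup, so none of these names should be read as its API.

```python
class SketchAverageMeter:
    """Hypothetical running average of a single metric."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, value, n=1):
        self.total += value * n
        self.count += n

    @property
    def avg(self):
        return self.total / self.count if self.count else 0.0


class SketchMeterGroup:
    """Hypothetical group: one running average per metric name."""

    def __init__(self):
        self.meters = {}

    def update(self, data):
        # data maps metric names to the values of one mini-batch.
        for name, value in data.items():
            self.meters.setdefault(name, SketchAverageMeter()).update(value)

    def summary(self):
        return {name: meter.avg for name, meter in self.meters.items()}


group = SketchMeterGroup()
group.update({"acc1": 0.5, "loss": 2.0})
group.update({"acc1": 1.0, "loss": 1.0})
summary = group.summary()
print(summary)
```

A group like this pairs naturally with a metrics callable that returns a dict, since each returned key gets its own meter.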