Strategy¶
Multi-trial Strategy¶
Random¶
GridSearch¶
RegularizedEvolution¶
TPE¶
PolicyBasedRL¶
One-shot Strategy¶
Note
The usage of one-shot strategies has been refreshed in v2.8. Please refer to legacy one-shot trainers for the old-style one-shot strategies.
DARTS¶
ENAS¶
- class nni.retiarii.oneshot.pytorch.enas.ReinforceController(fields, lstm_size=64, lstm_num_layers=1, tanh_constant=1.5, skip_target=0.4, temperature=None, entropy_reduction='sum')[source]¶
A controller that mutates the graph with RL.
- Parameters:
fields (list of ReinforceField) – List of fields to choose.
lstm_size (int) – Controller LSTM hidden units.
lstm_num_layers (int) – Number of layers for stacked LSTM.
tanh_constant (float) – Logits will be equal to tanh_constant * tanh(logits). tanh is not applied if this value is None.
skip_target (float) – Target probability that a skip-connect (chosen by InputChoice) will appear. If the sampled number of inputs deviates from this target, a KL-divergence skip penalty is added to the sample.
temperature (float) – Temperature constant that divides the logits.
entropy_reduction (str) – Can be one of sum or mean. How the entropy of a multi-input-choice is reduced.
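For orientation, a minimal construction sketch follows. The ReinforceField arguments shown (name, number of candidates, whether exactly one is chosen) are assumptions for illustration, not documented on this page:
from nni.retiarii.oneshot.pytorch.enas import ReinforceController, ReinforceField

# `fields` describes the choices the controller mutates; the ReinforceField
# signature used here (name, total, choose_one) is an assumption.
fields = [
    ReinforceField('conv_choice', 3, True),    # a LayerChoice with 3 candidates
    ReinforceField('skip_inputs', 2, False),   # an InputChoice over 2 predecessors
]
controller = ReinforceController(fields, lstm_size=64, tanh_constant=1.5, entropy_reduction='sum')
# Logits are post-processed as described above:
#   logits = logits / temperature            (if temperature is not None)
#   logits = tanh_constant * tanh(logits)    (if tanh_constant is not None)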
GumbelDARTS¶
RandomOneShot¶
Proxyless¶
Customization¶
Multi-trial¶
- class nni.retiarii.Sampler[source]
Handles Mutator.choice() calls.
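A minimal sketch of a custom sampler, assuming the method to override is choice(candidates, mutator, model, index) and that it returns one of the candidates:
import random

from nni.retiarii import Sampler

class RandomChoiceSampler(Sampler):
    def choice(self, candidates, mutator, model, index):
        # Called once per Mutator.choice(); return one element of `candidates`.
        return random.choice(candidates)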
One-shot¶
base_lightning¶
- class nni.retiarii.oneshot.pytorch.base_lightning.BaseOneShotLightningModule(model, mutation_hooks=None)[source]¶
The base class for all one-shot NAS modules.
In NNI, we try to separate the “search” part and the “training” part in one-shot NAS. The “training” part is defined with the evaluator interface (it has to be a Lightning evaluator to work with one-shot). Since the Lightning evaluator has already broken the training down into minimal building blocks, we can re-assemble those blocks after combining them with the “search” part of a particular algorithm.
After the re-assembly, this module defines all of the search + training. The experiment can use a Lightning trainer (which is another part of the evaluator) to train this module, so as to complete the search process.
Essential functions such as preprocessing the user’s model, redirecting Lightning hooks for the user’s model, configuring optimizers and exporting the NAS result are implemented in this class.
- nas_modules¶
Modules that have been mutated, which the search algorithms should care about.
- Type:
list[BaseSuperNetModule]
- model¶
PyTorch Lightning module. A model space with a training recipe defined (wrapped by LightningModule in the evaluator).
- Type:
pl.LightningModule
- Parameters:
inner_module (pytorch_lightning.LightningModule) – A LightningModule that defines computations, train/val loops and optimizers in a single class. When used in NNI, the inner_module is the combination of the evaluator instance and the base model (to be precise, a base model wrapped with LightningModule in the evaluator).
mutation_hooks (list[MutationHook]) –
Extra mutation hooks to support customized mutation on primitives other than the built-ins.
Mutation hooks are callables that take a Module and return a BaseSuperNetModule. They are invoked in traverse_and_mutate_submodules() on each submodule. For each submodule, the hooks in the list are invoked sequentially; later hooks can see the results of earlier hooks. Modules processed by mutation_hooks will be replaced by the returned module, stored in nas_modules, and become the focus of the NAS algorithm. The hook list will be appended with default_mutation_hooks in each one-shot module.
To be more specific, a hook takes four input arguments:
a module that might be processed,
the name of the module in its parent module,
a memo dict whose usage depends on the particular algorithm,
keyword arguments (configurations).
Note that the memo should be read/written by the hooks themselves. No hook will be called on the root module.
The return value can be one of three kinds:
a tuple of BaseSuperNetModule or None, and a boolean,
a boolean,
a BaseSuperNetModule or None.
The boolean value is suppress, which indicates whether the following hooks should be called. When it is true, the subsequent hooks are suppressed and will never be invoked. Without a boolean value specified, it is assumed to be false. If None appears in place of BaseSuperNetModule, the hook suggests keeping the module unchanged, and nothing will happen.
An example of a mutation hook is given in no_default_hook(). However, it is recommended to implement mutation hooks by deriving BaseSuperNetModule and adding its classmethod mutate to this list. A sketch of a hand-written hook is shown below.
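The sketch below follows the hook conventions described above; MySuperDropout is hypothetical, and only the signature and return conventions come from this documentation:
import torch.nn as nn

from nni.retiarii.oneshot.pytorch.base_lightning import BaseSuperNetModule

class MySuperDropout(BaseSuperNetModule):
    """Hypothetical super-net replacement for nn.Dropout (details omitted)."""

def my_mutation_hook(module, name, memo, mutate_kwargs):
    if isinstance(module, nn.Dropout):
        # Replace the module and suppress the remaining hooks for it.
        return MySuperDropout(), True
    # None means: keep this module unchanged; later hooks may still process it.
    return None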
- advance_lr_schedulers(batch_idx)[source]¶
Advance the learning rates, when manual optimization is turned on.
Only a partial implementation of the full Lightning scheduler loop is included here. Advanced features like reduce-lr-on-plateau are not supported.
- advance_optimization(loss, batch_idx, gradient_clip_val=None, gradient_clip_algorithm=None)[source]¶
Run the optimizers defined in the evaluator, when manual optimization is turned on.
Call this method when the model should be optimized. To keep it as neat as possible, we only implement the basic zero_grad, backward, grad_clip and step here. Many hooks and pre/post-processing steps are omitted. Override this method if you need more advanced behavior.
Only part of the full optimizer loop is implemented here.
- Parameters:
batch_idx (int) – The current batch index.
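As an illustration, a one-shot subclass might combine these helpers in its training step roughly as follows (a sketch assuming manual optimization is enabled; MyOneShotModule is hypothetical):
from nni.retiarii.oneshot.pytorch.base_lightning import BaseOneShotLightningModule

class MyOneShotModule(BaseOneShotLightningModule):
    def training_step(self, batch, batch_idx):
        # Delegate the loss computation to the inner LightningModule.
        loss = self.model.training_step(batch, batch_idx)
        # zero_grad -> backward -> grad_clip -> step on the evaluator's optimizers.
        self.advance_optimization(loss, batch_idx)
        # Advance the step-based learning rate schedulers.
        self.advance_lr_schedulers(batch_idx)
        return loss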
- architecture_optimizers()[source]¶
Get the optimizers configured in configure_architecture_optimizers().
- configure_architecture_optimizers()[source]¶
Hook kept for subclasses. A specific NAS method inheriting this base class should return its architecture optimizers here if architecture parameters are needed. Note that LR schedulers are currently not supported for architecture optimizers.
- Return type:
Optimizers used by a specific NAS algorithm. Return None if no architecture optimizers are needed.
- configure_optimizers()[source]¶
Transparently configure optimizers for the inner model, unless the one-shot algorithm has its own optimizers (via configure_architecture_optimizers()), in which case the architecture optimizers will be appended to the list.
The return value is still one of the 6 types defined in PyTorch-Lightning.
- export()[source]¶
Export the NAS result, ideally the best choice of each of the nas_modules. You may implement an export method for your customized nas_modules.
- Returns:
Keys are the names of nas_modules, and values are their choice indices.
- Return type:
dict
- export_probs()[source]¶
Export the probability of each choice in the search space being chosen.
Note
If this method is not implemented for some modules, they will simply be ignored.
- Returns:
In most cases, keys are the names of nas_modules suffixed with / and the choice name. Values are the probability / logits depending on the implementation.
- Return type:
dict
- resample(memo=None)[source]¶
Trigger the resample for each of the nas_modules. Sometimes (e.g., in differentiable cases), it does nothing.
- Parameters:
memo (dict[str, Any]) – Used to ensure the consistency of samples with the same label.
- Returns:
Sampled architecture.
- Return type:
dict
- search_space_spec()[source]¶
Get the search space specification from nas_modules.
- Returns:
Key is the name of the choice, value is the corresponding ParameterSpec.
- Return type:
dict
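For orientation, the typical interplay of these methods looks roughly like the following sketch (module and values are purely illustrative):
space = one_shot_module.search_space_spec()   # e.g. {'conv_choice': ParameterSpec(...)}
arch = one_shot_module.resample()             # e.g. {'conv_choice': 1}, a sampled architecture
best = one_shot_module.export()               # e.g. {'conv_choice': 2}, the best choice per module
probs = one_shot_module.export_probs()        # e.g. {'conv_choice/0': 0.2, ...}, if implemented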
- class nni.retiarii.oneshot.pytorch.base_lightning.BaseSuperNetModule[source]¶
Mutated module in a super-net. Usually, the feed-forward of the module itself is undefined. It has to be resampled with resample() so that a specific path is selected. (Sometimes this is not required, e.g., in a differentiable super-net.)
A super-net module usually corresponds to one sample, but there are two exceptions:
A module can have multiple parameter specs. For example, a convolution-2d can sample kernel size and channels at the same time.
Multiple modules can share one parameter spec. For example, multiple layer choices with the same label.
For value choice compositions, the parameter specs are bound to the underlying (original) value choices, rather than their compositions.
- export(memo)[source]¶
Export the final architecture within this module. It should have the same keys as search_space_spec().
- Parameters:
memo (dict[str, Any]) – Use the memo to avoid exporting the same label multiple times.
- export_probs(memo)[source]¶
Export the probability / logits of each choice being chosen.
- Parameters:
memo (dict[str, Any]) – Use the memo to avoid exporting the same label multiple times.
- classmethod mutate(module, name, memo, mutate_kwargs)[source]¶
This is a mutation hook that creates a BaseSuperNetModule. The method should be implemented in each specific super-net module, because each usually has its own rules about what kinds of modules to operate on.
- Parameters:
module (nn.Module) – The module to be mutated (replaced).
name (str) – Name of this module, with full prefix. For example, module1.block1.conv.
memo (dict) – Memo to enable sharing parameters among mutated modules. It should be read and written by the mutate functions themselves.
mutate_kwargs (dict) – Algorithm-related hyper-parameters, and some auxiliary information.
- Returns:
The mutation result, along with an optional boolean flag indicating whether to suppress follow-up mutation hooks. See BaseOneShotLightningModule for details.
- Return type:
Union[BaseSuperNetModule, bool, tuple[BaseSuperNetModule, bool]]
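A hypothetical mutate implementation might look like the sketch below; MyChoicePrimitive is an imaginary primitive, and the memo is used for label sharing as described above:
import torch.nn as nn

from nni.retiarii.oneshot.pytorch.base_lightning import BaseSuperNetModule

class MyChoicePrimitive(nn.Module):
    """Imaginary choice primitive carrying a `label` attribute."""
    def __init__(self, label):
        super().__init__()
        self.label = label

class MySuperLayer(BaseSuperNetModule):
    @classmethod
    def mutate(cls, module, name, memo, mutate_kwargs):
        if isinstance(module, MyChoicePrimitive):   # module-type rule specific to this super-net module
            if module.label in memo:                # share parameters among modules with the same label
                return memo[module.label]
            memo[module.label] = replacement = cls()
            return replacement                      # or (replacement, True) to suppress later hooks
        return None                                 # keep other modules unchanged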
- nni.retiarii.oneshot.pytorch.base_lightning.no_default_hook(module, name, memo, mutate_kwargs)[source]¶
Add this hook at the end of your hook list to raise an error for unsupported mutation primitives.
- nni.retiarii.oneshot.pytorch.base_lightning.traverse_and_mutate_submodules(root_module, hooks, mutate_kwargs, topdown=True)[source]¶
Traverse the module tree of root_module, and call hooks on every tree node.
- Parameters:
root_module (nn.Module) – User-defined model space. Since this method is called in the __init__ of BaseOneShotLightningModule, it’s usually a pytorch_lightning.LightningModule. The mutation will be in-place on root_module.
hooks (list[MutationHook]) – List of mutation hooks. See BaseOneShotLightningModule for how to write hooks. When a hook returns a module, the original module will be replaced (mutated) with the new module.
mutate_kwargs (dict) – Extra keyword arguments passed to hooks.
topdown (bool, default = True) – If topdown is true, hooks are called first, before traversing the sub-modules (i.e., pre-order DFS). Otherwise, sub-modules are traversed first, before calling hooks on this node (i.e., post-order DFS).
- Returns:
modules – The replacement result.
- Return type:
dict[str, nn.Module]
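A sketch of calling this function directly (it is normally invoked for you inside BaseOneShotLightningModule; MySuperLayer and model_space are hypothetical):
from nni.retiarii.oneshot.pytorch.base_lightning import no_default_hook, traverse_and_mutate_submodules

hooks = [MySuperLayer.mutate, no_default_hook]   # no_default_hook rejects unsupported primitives
mutated = traverse_and_mutate_submodules(
    model_space,          # user-defined model space; mutation happens in place
    hooks,
    mutate_kwargs={},     # extra keyword arguments forwarded to every hook
    topdown=True,         # call hooks before descending into sub-modules
)
# `mutated` maps module names to the modules that replaced them.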
dataloader¶
- class nni.retiarii.oneshot.pytorch.dataloader.ConcatLoader(loaders, mode='min_size')[source]¶
This loader is the same as CombinedLoader in PyTorch-Lightning, but concatenates sub-loaders instead of loading them in parallel.
- Parameters:
loaders (dict[str, Any]) –
For example,
{ "train": DataLoader(train_dataset), "val": DataLoader(val_dataset) }
In this example, the loader will first produce the batches from “train”, then “val”.
mode (str) – Only supports “min_size” for now.
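A construction sketch following the example above (the datasets are illustrative):
from torch.utils.data import DataLoader
from nni.retiarii.oneshot.pytorch.dataloader import ConcatLoader

loader = ConcatLoader({
    'train': DataLoader(train_dataset, batch_size=64),
    'val': DataLoader(val_dataset, batch_size=64),
}, mode='min_size')
# Iteration exhausts the "train" loader first, then the "val" loader,
# instead of drawing from both in parallel like CombinedLoader.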
supermodule.differentiable¶
- class nni.retiarii.oneshot.pytorch.supermodule.differentiable.DifferentiableMixedCell(op_factory, num_nodes, num_ops_per_node, num_predecessors, preprocessor, postprocessor, concat_dim, memo, mutate_kwargs, label)[source]¶
Implementation of Cell in a differentiable context.
Similar to PathSamplingCell, this cell only handles cells of specific kinds (e.g., with loose end).
An architecture parameter is created on each edge of the fully-connected graph.
- class nni.retiarii.oneshot.pytorch.supermodule.differentiable.DifferentiableMixedInput(n_candidates, n_chosen, alpha, softmax, label)[source]¶
Mixed input. Forward returns a weighted sum of candidates. The implementation is very similar to DifferentiableMixedLayer.
- Parameters:
n_candidates (int) – Expected number of input candidates.
n_chosen (int) – Expected number of inputs finally chosen.
alpha (Tensor) – Tensor that stores the “learnable” weights.
softmax (nn.Module) – Customizable softmax function. Usually nn.Softmax(-1).
label (str) – Name of the choice.
- label¶
Name of the choice.
- Type:
str
- class nni.retiarii.oneshot.pytorch.supermodule.differentiable.DifferentiableMixedLayer(paths, alpha, softmax, label)[source]¶
Mixed layer, in which fprop is decided by a weighted sum of several layers. Proposed in DARTS: Differentiable Architecture Search.
The weight alpha is usually learnable, and optimized on the validation dataset.
The differentiable sampling layer requires all operators to return the same shape for one input, as all outputs will be weighted and summed to get the final output.
- Parameters:
paths (list[tuple[str, nn.Module]]) – Layers to choose from. Each is a tuple of name, and its module.
alpha (Tensor) – Tensor that stores the “learnable” weights.
softmax (nn.Module) – Customizable softmax function. Usually nn.Softmax(-1).
label (str) – Name of the choice.
- op_names¶
Operator names.
- Type:
str
- label¶
Name of the choice.
- Type:
str
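Conceptually, the forward pass of such a layer reduces to the weighted sum sketched below (a simplification, not the actual implementation):
import torch.nn as nn

def mixed_layer_forward(x, paths, alpha, softmax=nn.Softmax(-1)):
    # paths: list of (name, module) pairs; every module must return the same shape.
    weights = softmax(alpha)                  # one weight per candidate operator
    return sum(w * op(x) for w, (_, op) in zip(weights, paths))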
- class nni.retiarii.oneshot.pytorch.supermodule.differentiable.DifferentiableMixedRepeat(blocks, depth, softmax, memo)[source]¶
Implementation of Repeat in a differentiable supernet. The result is a weighted sum of possible prefixes, sliced by possible depths.
If the output is not a single tensor, it will be summed at every independent dimension. See weighted_sum() for details.
- class nni.retiarii.oneshot.pytorch.supermodule.differentiable.GumbelSoftmax(dim=-1)[source]¶
Wrapper of F.gumbel_softmax. dim = -1 by default.
- class nni.retiarii.oneshot.pytorch.supermodule.differentiable.MixedOpDifferentiablePolicy(operation, memo, mutate_kwargs)[source]¶
Implements the differentiable sampling in mixed operation.
One mixed operation can have multiple value choices in its arguments. Thus the _arch_alpha here is a parameter dict, and named_parameters filters out the multiple parameters with _arch_alpha as their prefix.
When this class is asked for forward_argument, it returns a distribution, i.e., a dict from int to float based on its weights.
All the parameters (_arch_alpha, parameters(), _softmax) are saved as attributes of operation, rather than self, because this class itself is not an nn.Module, and parameters saved here would not be optimized.
supermodule.sampling¶
- class nni.retiarii.oneshot.pytorch.supermodule.sampling.MixedOpPathSamplingPolicy(operation, memo, mutate_kwargs)[source]¶
Implements the path sampling in mixed operation.
One mixed operation can have multiple value choices in its arguments. Each value choice can be further decomposed into “leaf value choices”. We sample the leaf nodes, and compose them into the values of the arguments.
- class nni.retiarii.oneshot.pytorch.supermodule.sampling.PathSamplingCell(op_factory, num_nodes, num_ops_per_node, num_predecessors, preprocessor, postprocessor, concat_dim, memo, mutate_kwargs, label)[source]¶
The implementation of super-net cell follows DARTS.
When factory_used is true, it reconstructs the cell for every possible combination of operation and input index, because for different input indices, the cell factory could instantiate different operations (e.g., with different strides). On export, we first find the best (operation, input) pairs, then select the best num_ops_per_node.
loose_end is not supported yet, because it would cause more problems (e.g., shape mismatch). We assume loose_end to be all regardless of its configuration.
A supernet cell can’t slim its own weights to fit into a sub-network, which is also a known issue.
- class nni.retiarii.oneshot.pytorch.supermodule.sampling.PathSamplingInput(n_candidates, n_chosen, reduction_type, label)[source]¶
Mixed input. Takes a list of tensors as input, selects some of them, and returns the sum.
- _sampled¶
Sampled input indices.
- Type:
int or list of int
- class nni.retiarii.oneshot.pytorch.supermodule.sampling.PathSamplingLayer(paths, label)[source]¶
Mixed layer, in which fprop is decided by exactly one inner layer or a sum of multiple (sampled) layers. If multiple modules are selected, their results will be summed and returned.
- _sampled¶
Sampled module indices.
- Type:
int or list of str
- label¶
Name of the choice.
- Type:
str
supermodule.proxyless¶
- class nni.retiarii.oneshot.pytorch.supermodule.proxyless.ProxylessMixedInput(n_candidates, n_chosen, alpha, softmax, label)[source]¶
Proxyless version of differentiable input choice. See ProxylessMixedLayer for implementation details.
supermodule.operation¶
- class nni.retiarii.oneshot.pytorch.supermodule.operation.MixedBatchNorm2d(module_kwargs)[source]¶
Mixed BatchNorm2d operation.
Supported arguments are:
num_features
eps (only supported in path sampling)
momentum (only supported in path sampling)
For path sampling, prefixes of weight, bias, running_mean and running_var are sliced. For weighted cases, the maximum num_features is used directly.
Momentum is required to be a float. PyTorch BatchNorm supports the case where momentum is None, which is not supported here.
- bound_type¶
alias of
BatchNorm2d
- class nni.retiarii.oneshot.pytorch.supermodule.operation.MixedConv2d(module_kwargs)[source]¶
Mixed conv2d op.
Supported arguments are:
in_channels
out_channels
groups
stride (only supported in path sampling)
kernel_size
padding
dilation (only supported in path sampling)
padding will be the “max” padding in differentiable mode.
Mutable groups is NOT supported in most cases of differentiable mode. However, we do support one special case where the group number is proportional to in_channels and out_channels. This is often the case for depth-wise convolutions.
For channels, the prefix will be sliced. For kernels, we take the smaller kernel from the center and round it down (towards the top-left). For example:
max_kernel = 5*5, sampled_kernel = 3*3, then we take [1: 4]
□ □ □ □ □
□ ■ ■ ■ □
□ ■ ■ ■ □
□ ■ ■ ■ □
□ □ □ □ □
max_kernel = 5*5, sampled_kernel = 2*2, then we take [1: 3]
□ □ □ □ □
□ ■ ■ □ □
□ ■ ■ □ □
□ □ □ □ □
□ □ □ □ □
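The center slicing can be sketched as follows (an illustrative helper, not the library’s API):
def slice_kernel(weight, sampled_kernel):
    # weight: [out_channels, in_channels, max_kernel, max_kernel]
    max_kernel = weight.shape[-1]
    start = (max_kernel - sampled_kernel) // 2   # round down, i.e. towards the top-left
    return weight[..., start:start + sampled_kernel, start:start + sampled_kernel]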
- bound_type¶
alias of
Conv2d
- class nni.retiarii.oneshot.pytorch.supermodule.operation.MixedLayerNorm(module_kwargs)[source]¶
Mixed LayerNorm operation.
Supported arguments are:
normalized_shape
eps (only supported in path sampling)
For path sampling, prefixes of weight and bias are sliced. For weighted cases, the maximum normalized_shape is used directly.
eps is required to be a float.
- bound_type¶
alias of
LayerNorm
- class nni.retiarii.oneshot.pytorch.supermodule.operation.MixedLinear(module_kwargs)[source]¶
Mixed linear operation.
Supported arguments are:
in_features
out_features
Prefixes of weight and bias will be sliced.
- bound_type¶
alias of
Linear
- class nni.retiarii.oneshot.pytorch.supermodule.operation.MixedMultiHeadAttention(module_kwargs)[source]¶
Mixed multi-head attention.
Supported arguments are:
embed_dim
num_heads (only supported in path sampling)
kdim
vdim
dropout (only supported in path sampling)
At init, it constructs the largest possible Q, K, V dimensions. At forward, it slices the prefix of the weight matrices according to the sampled value. For in_proj_bias and in_proj_weight, three parts will be sliced and concatenated together: [0, embed_dim), [max_embed_dim, max_embed_dim + embed_dim), and [max_embed_dim * 2, max_embed_dim * 2 + embed_dim).
.Warning
All candidates of embed_dim should be divisible by all candidates of num_heads.
- bound_type¶
alias of
MultiheadAttention
- class nni.retiarii.oneshot.pytorch.supermodule.operation.MixedOperation(module_kwargs)[source]¶
This is the base class for all mixed operations. It’s what you should inherit to support a new operation with ValueChoice.
It contains commonly used utilities that ease the effort of writing customized mixed operations, i.e., operations with ValueChoice in their arguments. To customize, write your own mixed operation and add its hook to the mutation_hooks parameter when using the strategy.
By design, for a mixed operation to work in a specific algorithm, at least two classes are needed.
One class needs to inherit this class, to control operation-related behavior, such as how to initialize the operation such that the sampled operation can be its sub-operation.
The other one needs to inherit MixedOperationSamplingPolicy, which controls algorithm-related behavior, such as sampling.
The two classes are linked via the sampling_policy attribute of MixedOperation, whose type is set via mixed_op_sampling in mutate_kwargs when MixedOperation.mutate() is called.
With this design, one mixed operation (e.g., MixedConv2d) can work in multiple algorithms (e.g., both DARTS and ENAS), saving the engineering effort of rewriting all operations for each specific algorithm.
This class should also define a bound_type, to control the matching type in mutate, and an argument_list, to control which arguments can be dynamically used in forward. This list will also be used in mutate for sanity checks.
- export(memo)[source]¶
Delegates to MixedOperationSamplingPolicy.export().
- export_probs(memo)[source]¶
Delegates to MixedOperationSamplingPolicy.export_probs().
- forward(*args, **kwargs)[source]¶
First get the sampled arguments, then forward with the sampled arguments (by calling forward_with_args).
- forward_argument(name)[source]¶
Get the argument used in forward. This is often related to the algorithm. We redirect this to the sampling policy.
- forward_with_args(*args, **kwargs)[source]¶
Controls the real fprop. The accepted arguments are argument_list, followed by the forward arguments of the bound_type.
- classmethod mutate(module, name, memo, mutate_kwargs)[source]¶
Find the value choices in the module’s arguments and replace the whole module.
- resample(memo)[source]¶
Delegates to MixedOperationSamplingPolicy.resample().
- slice_param(**kwargs)[source]¶
Slice the parameters and buffers for sub-net forward and state dict. When mapping=True is present in kwargs, the returned result will be wrapped in a dict.
- super_init_argument(name, value_choice)[source]¶
Get the initialization argument when constructing the super-kernel, i.e., when calling super().__init__(). This is often related to the specific operator, rather than the algorithm.
For example:
def super_init_argument(self, name, value_choice):
    return max(value_choice.candidates)
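Putting the pieces together, a hypothetical custom mixed operation might look like the sketch below (modeled on the built-in operations; the class and its details are assumptions, not part of the library):
import torch.nn as nn
import torch.nn.functional as F

from nni.retiarii.oneshot.pytorch.supermodule.operation import MixedOperation

class MixedDropout(MixedOperation, nn.Dropout):
    bound_type = nn.Dropout          # the module type matched and replaced in mutate
    argument_list = ['p']            # arguments that may carry a ValueChoice

    def super_init_argument(self, name, value_choice):
        # Construct the "largest" possible super-operation.
        return max(value_choice.candidates)

    def forward_with_args(self, p, input):
        # `p` is the sampled (or weighted) argument supplied by the sampling policy.
        return F.dropout(input, p=p, training=self.training)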
- class nni.retiarii.oneshot.pytorch.supermodule.operation.MixedOperationSamplingPolicy(operation, memo, mutate_kwargs)[source]¶
Algorithm-related part of a mixed operation.
MixedOperation delegates its resample and export to this policy (or its subclass), so that one operation can easily be combined with different kinds of sampling.
One SamplingStrategy corresponds to one mixed operation.
- export(operation, memo)[source]¶
The handler of MixedOperation.export().
- export_probs(operation, memo)[source]¶
The handler of MixedOperation.export_probs().
- forward_argument(operation, name)[source]¶
Compute the argument named name used in the operation’s forward. Usually a value, or a distribution over values.
- resample(operation, memo)[source]¶
The handler of MixedOperation.resample().