Uncategorized Modules¶
Experiment¶
- class nni.retiarii.experiment.pytorch.RetiariiExeConfig(training_service_platform: Optional[str] = None, execution_engine: Union[str, nni.nas.experiment.config.engine_config.ExecutionEngineConfig] = 'py', **kwargs)[source]¶
- class nni.retiarii.experiment.pytorch.RetiariiExperiment(base_model=None, evaluator=None, applied_mutators=None, strategy=None, trainer=None)[source]¶
The entry point for a NAS experiment. Use this class to start, stop, or inspect an experiment, for example to export the results.
RetiariiExperiment is a subclass of nni.experiment.Experiment, and the two have much in common, such as a configurable training service for distributing the experiment to remote servers. Unlike nni.experiment.Experiment, however, RetiariiExperiment does not support configuring:
trial_code_directory, which can only be the current working directory.
search_space, which is auto-generated in NAS.
trial_command, which must be python -m nni.retiarii.trial_entry to launch the modularized trial code.
RetiariiExperiment also has no tuner/assessor/advisor, because they are implemented in the strategy.
Also, unlike nni.experiment.Experiment, which is bound to a node server, RetiariiExperiment starts a node server to schedule the trials only when the strategy is a multi-trial strategy. When the strategy is one-shot, the step of launching the node server is skipped, and the experiment runs locally by default.
Configurations of experiments, such as the execution engine and the number of GPUs allocated, should be put into a RetiariiExeConfig and passed as an argument to RetiariiExperiment.run().
- Parameters:
base_model (nn.Module) – The model defining the search space / base skeleton without mutation. It should be wrapped by the decorator nni.retiarii.model_wrapper.
evaluator (nni.retiarii.Evaluator, default = None) – Evaluator for the experiment. If you are using a one-shot trainer, it should be placed here, although this usage is deprecated.
applied_mutators (list of nni.retiarii.Mutator, default = None) – Mutators to mutate the base model. If None, mutators are skipped. Note that when base_model uses inline mutations (e.g., LayerChoice), applied_mutators must be empty / None.
strategy (nni.retiarii.strategy.BaseStrategy, default = None) – Exploration strategy. Can be multi-trial or one-shot.
trainer (BaseOneShotTrainer) – Kept for compatibility purposes.
Examples
Multi-trial NAS:
>>> base_model = Net()
>>> search_strategy = strategy.Random()
>>> model_evaluator = FunctionalEvaluator(evaluate_model)
>>> exp = RetiariiExperiment(base_model, model_evaluator, [], search_strategy)
>>> exp_config = RetiariiExeConfig('local')
>>> exp_config.trial_concurrency = 2
>>> exp_config.max_trial_number = 20
>>> exp_config.training_service.use_active_gpu = False
>>> exp.run(exp_config, 8081)
One-shot NAS:
>>> base_model = Net()
>>> search_strategy = strategy.DARTS()
>>> evaluator = pl.Classification(train_dataloader=train_loader, val_dataloaders=valid_loader)
>>> exp = RetiariiExperiment(base_model, evaluator, [], search_strategy)
>>> exp_config = RetiariiExeConfig()
>>> exp_config.execution_engine = 'oneshot'  # must be set for a one-shot strategy
>>> exp.run(exp_config)
Export top models:
>>> for model_dict in exp.export_top_models(formatter='dict'):
...     print(model_dict)
>>> with nni.retiarii.fixed_arch(model_dict):
...     final_model = Net()
- export_top_models(top_k=1, optimize_mode='maximize', formatter='dict')[source]¶
Export several top performing models.
For one-shot algorithms, only top-1 is supported. For others, optimize_mode and formatter are available for customization.
The concrete behavior of export depends on each strategy. See the documentation of each strategy for detailed specifications.
- Parameters:
top_k (int) – How many models are intended to be exported.
optimize_mode (str) – maximize or minimize. Not supported by one-shot algorithms. optimize_mode is likely to be removed and defined in the strategy in the future.
formatter (str) – Supports code and dict. Not supported by one-shot algorithms. If code, the Python code of the model will be returned. If dict, the mutation history will be returned.
- static resume(experiment_id, port=8080, debug=False)[source]¶
Resume a stopped experiment.
- Parameters:
experiment_id (str) – The stopped experiment id.
port (int) – The port of web UI.
debug (bool) – Whether to start in debug mode.
- run(config=None, port=8080, debug=False)[source]¶
Run the experiment. This function blocks until the experiment finishes or fails.
NAS Benchmarks¶
NAS-Bench-101¶
- class nni.nas.benchmarks.nasbench101.Nb101IntermediateStats(*args, **kwargs)[source]¶
Intermediate statistics for NAS-Bench-101.
- trial¶
The exact trial where the intermediate result is produced.
- Type:
- current_epoch¶
Elapsed epochs when evaluation is done.
- Type:
int
- train_acc¶
Intermediate accuracy on training data, ranging from 0 to 100.
- Type:
float
- valid_acc¶
Intermediate accuracy on validation data, ranging from 0 to 100.
- Type:
float
- test_acc¶
Intermediate accuracy on test data, ranging from 0 to 100.
- Type:
float
- training_time¶
Time elapsed in seconds.
- Type:
float
- class nni.nas.benchmarks.nasbench101.Nb101TrialConfig(*args, **kwargs)[source]¶
Trial config for NAS-Bench-101.
- arch¶
A dict with keys op1, op2, … and input1, input2, … Vertices are enumerated from 0. Since node 0 is the input node, it is skipped in this dict. Each op is one of nni.nas.benchmarks.nasbench101.CONV3X3_BN_RELU, nni.nas.benchmarks.nasbench101.CONV1X1_BN_RELU, and nni.nas.benchmarks.nasbench101.MAXPOOL3X3. Each input is a list of previous nodes. For example, input5 can be [0, 1, 3].
- Type:
dict
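The layout described above can be illustrated with a small, self-contained sketch. The string values assigned to the operator constants, the helper validate_nb101_arch, and the convention that the last vertex (the output) has an input key but no op key are all assumptions made for this example; they are not taken from the library.

```python
# Assumed string values standing in for the operator constants named above.
CONV3X3_BN_RELU = "conv3x3-bn-relu"
CONV1X1_BN_RELU = "conv1x1-bn-relu"
MAXPOOL3X3 = "maxpool3x3"
VALID_OPS = {CONV3X3_BN_RELU, CONV1X1_BN_RELU, MAXPOOL3X3}


def validate_nb101_arch(arch):
    """Hypothetical helper: check that an arch dict follows the described layout."""
    for key, value in arch.items():
        if key.startswith("op"):
            if value not in VALID_OPS:
                raise ValueError(f"{key}: unknown operator {value!r}")
        elif key.startswith("input"):
            node = int(key[len("input"):])
            # Every input must refer to an earlier vertex (vertex 0 is the cell input).
            if not all(0 <= src < node for src in value):
                raise ValueError(f"{key}: inputs must be previous nodes, got {value}")
        else:
            raise ValueError(f"unexpected key {key!r}")
    return True


arch = {
    "op1": CONV3X3_BN_RELU,
    "op2": MAXPOOL3X3,
    "op3": CONV1X1_BN_RELU,
    "input1": [0],
    "input2": [0, 1],
    "input3": [1, 2],
    "input4": [0, 3],  # assumed: the last vertex is the output and has no op
}
print(validate_nb101_arch(arch))
```

The check only covers the key naming and the "previous nodes" rule; it does not verify benchmark-specific limits such as the maximum number of vertices.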
- num_vertices¶
Number of vertices (nodes) in one cell. Should be less than or equal to 7 in default setup.
- Type:
int
- hash¶
Graph-invariant MD5 string for this architecture.
- Type:
str
- num_epochs¶
Number of epochs planned for this trial. Should be one of 4, 12, 36, 108 in default setup.
- Type:
int
- class nni.nas.benchmarks.nasbench101.Nb101TrialStats(*args, **kwargs)[source]¶
Computation statistics for NAS-Bench-101. Each corresponds to one trial. Each config has multiple trials with different random seeds, but the seed of each trial is unavailable. NAS-Bench-101 trains and evaluates on CIFAR-10 by default. The original training set is divided into 40k training images and 10k validation images, and the original validation set is used for test only.
- config¶
Setup for this trial data.
- Type:
- train_acc¶
Final accuracy on training data, ranging from 0 to 100.
- Type:
float
- valid_acc¶
Final accuracy on validation data, ranging from 0 to 100.
- Type:
float
- test_acc¶
Final accuracy on test data, ranging from 0 to 100.
- Type:
float
- parameters¶
Number of trainable parameters in million.
- Type:
float
- training_time¶
Duration of training in seconds.
- Type:
float
- nni.nas.benchmarks.nasbench101.query_nb101_trial_stats(arch, num_epochs, isomorphism=True, reduction=None, include_intermediates=False)[source]¶
Query trial stats of NAS-Bench-101 given conditions.
- Parameters:
arch (dict or None) – If a dict, it is in the format described in nni.nas.benchmarks.nasbench101.Nb101TrialConfig, and only matching trial stats will be returned. If None, all architectures in the database will be matched.
num_epochs (int or None) – If an int, only results with that number of epochs will be returned. Otherwise a wildcard.
isomorphism (boolean) – Whether to match essentially identical architectures, i.e., architectures with the same graph-invariant hash value.
reduction (str or None) – If ‘none’ or None, all trial stats will be returned directly. If ‘mean’, fields in trial stats will be averaged over trials with the same trial config.
include_intermediates (boolean) – If true, intermediate results will be returned.
- Returns:
A generator of nni.nas.benchmarks.nasbench101.Nb101TrialStats objects, each of which has been converted into a dict.
- Return type:
generator of dict
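Because the query yields plain dicts, the results can be post-processed with ordinary Python. The sketch below picks the record with the highest test accuracy. The helper best_trial is hypothetical, and the records are hand-written stand-ins with made-up values rather than real benchmark data; in practice the iterable would come from query_nb101_trial_stats.

```python
def best_trial(trial_stats, metric="test_acc"):
    """Return the record with the highest value of `metric` from an iterable of dicts."""
    return max(trial_stats, key=lambda record: record[metric])


# Hand-written stand-ins for the dicts yielded by the query (values are made up).
fake_stats = (
    {"test_acc": acc, "training_time": seconds}
    for acc, seconds in [(93.1, 1800.0), (94.2, 2100.0), (92.7, 1750.0)]
)

best = best_trial(fake_stats)
print(best["test_acc"])
```

Since the result is a generator, it can only be consumed once; wrap it in list(...) first if several passes are needed.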
NAS-Bench-201¶
- class nni.nas.benchmarks.nasbench201.Nb201IntermediateStats(*args, **kwargs)[source]¶
Intermediate statistics for NAS-Bench-201.
- trial¶
Corresponding trial.
- Type:
- current_epoch¶
Elapsed epochs.
- Type:
int
- train_acc¶
Current accuracy on training data, ranging from 0 to 100.
- Type:
float
- valid_acc¶
Current accuracy on validation data, ranging from 0 to 100.
- Type:
float
- test_acc¶
Current accuracy on test data, ranging from 0 to 100.
- Type:
float
- ori_test_acc¶
Test accuracy on original validation set (10k for CIFAR and 12k for Imagenet16-120), ranging from 0 to 100.
- Type:
float
- train_loss¶
Current cross entropy loss on training data.
- Type:
float or None
- valid_loss¶
Current cross entropy loss on validation data.
- Type:
float or None
- test_loss¶
Current cross entropy loss on test data.
- Type:
float or None
- ori_test_loss¶
Current cross entropy loss on original validation set.
- Type:
float or None
- class nni.nas.benchmarks.nasbench201.Nb201TrialConfig(*args, **kwargs)[source]¶
Trial config for NAS-Bench-201.
- arch¶
A dict with keys 0_1, 0_2, 0_3, 1_2, 1_3, 2_3, each of which is an operator chosen from nni.nas.benchmarks.nasbench201.NONE, nni.nas.benchmarks.nasbench201.SKIP_CONNECT, nni.nas.benchmarks.nasbench201.CONV_1X1, nni.nas.benchmarks.nasbench201.CONV_3X3 and nni.nas.benchmarks.nasbench201.AVG_POOL_3X3.
- Type:
dict
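As a rough illustration of this key layout, the sketch below generates the six keys from the node pairs i < j over four nodes and pairs them with operators. The string values of the constants and the helper make_nb201_arch are assumptions for this example, not part of the library.

```python
from itertools import combinations

# Assumed string values standing in for the operator constants named above.
NONE = "none"
SKIP_CONNECT = "skip_connect"
CONV_1X1 = "conv_1x1"
CONV_3X3 = "conv_3x3"
AVG_POOL_3X3 = "avg_pool_3x3"

# Keys "i_j" for every pair of nodes i < j among the four nodes 0..3.
EDGE_KEYS = [f"{i}_{j}" for i, j in combinations(range(4), 2)]


def make_nb201_arch(ops):
    """Hypothetical helper: pair six operators with the six keys, in order."""
    if len(ops) != len(EDGE_KEYS):
        raise ValueError(f"expected {len(EDGE_KEYS)} operators, got {len(ops)}")
    return dict(zip(EDGE_KEYS, ops))


arch = make_nb201_arch(
    [CONV_3X3, SKIP_CONNECT, NONE, CONV_1X1, AVG_POOL_3X3, CONV_3X3]
)
print(arch)
```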
- num_epochs¶
Number of epochs planned for this trial. Should be one of 12 and 200.
- Type:
int
- num_channels¶
Number of channels for initial convolution. 16 by default.
- Type:
int
- num_cells¶
Number of cells per stage. 5 by default.
- Type:
int
- dataset¶
Dataset used for training and evaluation. NAS-Bench-201 provides the following 4 options: cifar10-valid (training data is split into 25k for training and 25k for validation, validation data is used for test), cifar10 (training data is used for training, validation data is split into 5k for validation and 5k for testing), cifar100 (same protocol as cifar10), and imagenet16-120 (a subset of 120 classes in ImageNet, downscaled to 16x16, using training data for training, 6k images from the validation set for validation and the other 6k for testing).
- Type:
str
- class nni.nas.benchmarks.nasbench201.Nb201TrialStats(*args, **kwargs)[source]¶
Computation statistics for NAS-Bench-201. Each corresponds to one trial.
- config¶
Setup for this trial data.
- Type:
- seed¶
Random seed selected, for reproduction.
- Type:
int
- train_acc¶
Final accuracy on training data, ranging from 0 to 100.
- Type:
float
- valid_acc¶
Final accuracy on validation data, ranging from 0 to 100.
- Type:
float
- test_acc¶
Final accuracy on test data, ranging from 0 to 100.
- Type:
float
- ori_test_acc¶
Test accuracy on original validation set (10k for CIFAR and 12k for Imagenet16-120), ranging from 0 to 100.
- Type:
float
- train_loss¶
Final cross entropy loss on training data. Note that the loss could be NaN, in which case this attribute will be None.
- Type:
float or None
- valid_loss¶
Final cross entropy loss on validation data.
- Type:
float or None
- test_loss¶
Final cross entropy loss on test data.
- Type:
float or None
- ori_test_loss¶
Final cross entropy loss on original validation set.
- Type:
float or None
- parameters¶
Number of trainable parameters in million.
- Type:
float
- latency¶
Latency in seconds.
- Type:
float
- flops¶
FLOPs in million.
- Type:
float
- training_time¶
Duration of training in seconds.
- Type:
float
- valid_evaluation_time¶
Time elapsed to evaluate on validation set.
- Type:
float
- test_evaluation_time¶
Time elapsed to evaluate on test set.
- Type:
float
- ori_test_evaluation_time¶
Time elapsed to evaluate on original test set.
- Type:
float
- nni.nas.benchmarks.nasbench201.query_nb201_trial_stats(arch, num_epochs, dataset, reduction=None, include_intermediates=False)[source]¶
Query trial stats of NAS-Bench-201 given conditions.
- Parameters:
arch (dict or None) – If a dict, it is in the format described in nni.nas.benchmarks.nasbench201.Nb201TrialConfig, and only matching trial stats will be returned. If None, all architectures in the database will be matched.
num_epochs (int or None) – If an int, only results with that number of epochs will be returned. Otherwise a wildcard.
dataset (str or None) – If specified, can be one of the datasets available in nni.nas.benchmarks.nasbench201.Nb201TrialConfig. Otherwise a wildcard.
reduction (str or None) – If ‘none’ or None, all trial stats will be returned directly. If ‘mean’, fields in trial stats will be averaged over trials with the same trial config.
include_intermediates (boolean) – If true, intermediate results will be returned.
- Returns:
A generator of nni.nas.benchmarks.nasbench201.Nb201TrialStats objects, each of which has been converted into a dict.
- Return type:
generator of dict
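To make the ‘mean’ reduction more concrete, here is a toy sketch of averaging numeric fields over trials that share a config. It only mimics the described behavior and is not the library's implementation; the helper mean_by_config, the config_id field, and all values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def mean_by_config(records, config_key="config_id", fields=("train_acc", "test_acc")):
    """Average the selected numeric fields over records that share the same config."""
    groups = defaultdict(list)
    for record in records:
        groups[record[config_key]].append(record)
    return {
        config: {field: mean(r[field] for r in group) for field in fields}
        for config, group in groups.items()
    }


# Made-up records: two seeds for config "a", one seed for config "b".
records = [
    {"config_id": "a", "seed": 777, "train_acc": 99.0, "test_acc": 93.0},
    {"config_id": "a", "seed": 888, "train_acc": 98.0, "test_acc": 94.0},
    {"config_id": "b", "seed": 777, "train_acc": 97.0, "test_acc": 91.0},
]

averaged = mean_by_config(records)
print(averaged["a"]["test_acc"])
```

Per-trial fields such as seed are dropped here because they are not meaningful after averaging.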
NDS¶
- class nni.nas.benchmarks.nds.NdsIntermediateStats(*args, **kwargs)[source]¶
Intermediate statistics for NDS.
- trial¶
Corresponding trial.
- Type:
- current_epoch¶
Elapsed epochs.
- Type:
int
- train_loss¶
Current cross entropy loss on training data. Can be NaN (None).
- Type:
float or None
- train_acc¶
Current accuracy on training data, ranging from 0 to 100.
- Type:
float
- test_acc¶
Current accuracy on test data, ranging from 0 to 100.
- Type:
float
- class nni.nas.benchmarks.nds.NdsTrialConfig(*args, **kwargs)[source]¶
Trial config for NDS.
- model_family¶
Could be nas_cell, residual_bottleneck, residual_basic or vanilla.
- Type:
str
- model_spec¶
If model_family is nas_cell, it contains num_nodes_normal, num_nodes_reduce, depth, width, aux and drop_prob. If model_family is residual_bottleneck, it contains bot_muls, ds (depths), num_gs (number of groups) and ss (strides). If model_family is residual_basic or vanilla, it contains ds, ss and ws.
- Type:
dict
- cell_spec¶
If model_family is not nas_cell, it will be an empty dict. Otherwise, it specifies <normal/reduce>_<i>_<op/input>_<x/y>, where i ranges from 0 to num_nodes_<normal/reduce> - 1. If it is an op, the value is chosen from the constants specified previously, like nni.nas.benchmarks.nds.CONV_1X1. If it is i’s input, the value ranges from 0 to i + 1, as nas_cell uses the previous two nodes as inputs, and node 0 is actually the second node. Refer to the NASNet paper for details. Finally, another two key-value pairs, normal_concat and reduce_concat, specify which nodes are eventually concatenated into the output.
- Type:
dict
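The key pattern described above can be enumerated with a short sketch. The helpers cell_spec_keys and valid_inputs are hypothetical, and treating the input range "0 to i + 1" as inclusive is an assumption of this example.

```python
def cell_spec_keys(num_nodes_normal, num_nodes_reduce):
    """Hypothetical helper: list the keys a nas_cell cell_spec is described as having."""
    keys = []
    for cell, num_nodes in (("normal", num_nodes_normal), ("reduce", num_nodes_reduce)):
        for i in range(num_nodes):
            for kind in ("op", "input"):
                for branch in ("x", "y"):
                    keys.append(f"{cell}_{i}_{kind}_{branch}")
        # One extra key per cell listing the nodes concatenated into the output.
        keys.append(f"{cell}_concat")
    return keys


def valid_inputs(i):
    """Candidate input indices for node i, assuming the range 0..i+1 is inclusive."""
    return list(range(i + 2))


keys = cell_spec_keys(num_nodes_normal=2, num_nodes_reduce=1)
print(keys)
print(valid_inputs(0))
```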
- dataset¶
Dataset used. Could be cifar10 or imagenet.
- Type:
str
- generator¶
Can be one of random, which generates configurations at random while keeping learning rate and weight decay fixed; fix_w_d, which further keeps width and depth fixed and is only applicable for nas_cell; and tune_lr_wd, which further tunes learning rate and weight decay.
- Type:
str
- proposer¶
The paper that proposed the distribution for random sampling. Available proposers include nasnet, darts, enas, pnas, amoeba, vanilla, resnext-a, resnext-b, resnet, resnet-b (ResNet with bottleneck). See the NDS paper for details.
- Type:
str
- base_lr¶
Initial learning rate.
- Type:
float
- weight_decay¶
L2 weight decay applied on weights.
- Type:
float
- num_epochs¶
Number of epochs scheduled, during which learning rate will decay to 0 following cosine annealing.
- Type:
int
- class nni.nas.benchmarks.nds.NdsTrialStats(*args, **kwargs)[source]¶
Computation statistics for NDS. Each corresponds to one trial.
- config¶
Corresponding config for trial.
- Type:
- seed¶
Random seed selected, for reproduction.
- Type:
int
- final_train_acc¶
Final accuracy on training data, ranging from 0 to 100.
- Type:
float
- final_train_loss¶
Final cross entropy loss on training data. Could be NaN (None).
- Type:
float or None
- final_test_acc¶
Final accuracy on test data, ranging from 0 to 100.
- Type:
float
- best_train_acc¶
Best accuracy on training data, ranging from 0 to 100.
- Type:
float
- best_train_loss¶
Best cross entropy loss on training data. Could be NaN (None).
- Type:
float or None
- best_test_acc¶
Best accuracy on test data, ranging from 0 to 100.
- Type:
float
- parameters¶
Number of trainable parameters in million.
- Type:
float
- flops¶
FLOPs in million.
- Type:
float
- iter_time¶
Seconds elapsed for each iteration.
- Type:
float
- nni.nas.benchmarks.nds.query_nds_trial_stats(model_family, proposer, generator, model_spec, cell_spec, dataset, num_epochs=None, reduction=None, include_intermediates=False)[source]¶
Query trial stats of NDS given conditions.
- Parameters:
model_family (str or None) – If str, can be one of the model families available in nni.nas.benchmarks.nds.NdsTrialConfig. Otherwise a wildcard.
proposer (str or None) – If str, can be one of the proposers available in nni.nas.benchmarks.nds.NdsTrialConfig. Otherwise a wildcard.
generator (str or None) – If str, can be one of the generators available in nni.nas.benchmarks.nds.NdsTrialConfig. Otherwise a wildcard.
model_spec (dict or None) – If specified, can be one of the model specs available in nni.nas.benchmarks.nds.NdsTrialConfig. Otherwise a wildcard.
cell_spec (dict or None) – If specified, can be one of the cell specs available in nni.nas.benchmarks.nds.NdsTrialConfig. Otherwise a wildcard.
dataset (str or None) – If str, can be one of the datasets available in nni.nas.benchmarks.nds.NdsTrialConfig. Otherwise a wildcard.
num_epochs (int or None) – If an int, only results with that number of epochs will be returned. Otherwise a wildcard.
reduction (str or None) – If ‘none’ or None, all trial stats will be returned directly. If ‘mean’, fields in trial stats will be averaged over trials with the same trial config.
include_intermediates (boolean) – If true, intermediate results will be returned.
- Returns:
A generator of nni.nas.benchmarks.nds.NdsTrialStats objects, each of which has been converted into a dict.
- Return type:
generator of dict
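The wildcard semantics of these parameters (None matches anything) can be sketched as a plain filter. This illustrates the matching rule only, not how the query is implemented; the helper matches and the sample records are hypothetical.

```python
def matches(record, **conditions):
    """Return True if `record` satisfies every condition; None acts as a wildcard."""
    return all(
        expected is None or record.get(field) == expected
        for field, expected in conditions.items()
    )


# Made-up config records using values listed in NdsTrialConfig.
configs = [
    {"model_family": "nas_cell", "proposer": "darts", "dataset": "cifar10"},
    {"model_family": "nas_cell", "proposer": "enas", "dataset": "cifar10"},
    {"model_family": "residual_basic", "proposer": "resnet", "dataset": "imagenet"},
]

selected = [
    c for c in configs
    if matches(c, model_family="nas_cell", proposer=None, dataset="cifar10")
]
print([c["proposer"] for c in selected])
```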
Retrain (Architecture Evaluation)¶
- nni.retiarii.fixed_arch(fixed_arch, verbose=True)[source]¶
Load architecture from fixed_arch and apply it to the model. This should be used as a context manager. For example,
with fixed_arch('/path/to/export.json'):
    model = Model(3, 224, 224)
- Parameters:
fixed_arch (str, Path or dict) – Path to the JSON that stores the architecture, or dict that stores the exported architecture.
verbose (bool) – Print log messages if set to True
- Returns:
Context manager that provides a fixed architecture when the model is created.
- Return type:
Utilities¶
- nni.retiarii.basic_unit(cls, basic_unit_tag=True)[source]¶
Wrapping a module as a basic unit makes it a primitive and stops the engine from digging deeper into it.
basic_unit_tag is true by default. If set to false, the module will not be explicitly marked as a basic unit, and the graph parser will continue to parse it. Currently, this is to handle a special case in nn.Sequential.
Although basic_unit calls trace in its implementation, it is not for serialization. Rather, it is meant to capture the initialization arguments for mutation. Also, the graph execution engine will stop digging into the inner modules when it reaches a module that is decorated with basic_unit.
@basic_unit
class PrimitiveOp(nn.Module):
    ...
- nni.retiarii.model_wrapper(cls)[source]¶
Wrap the base model (search space). For example,
@model_wrapper
class MyModel(nn.Module):
    ...
The wrapper serves two purposes:
Capture the init parameters of python class so that it can be re-instantiated in another process.
Reset uid in namespace so that the auto label counting in each model stably starts from zero.
Currently, NNI might not complain in simple cases where @model_wrapper is actually not needed. But in the future, we might enforce @model_wrapper to be required for the base model.
- class nni.retiarii.nn.pytorch.mutation_utils.Mutable[source]¶
This is just an implementation trick for now.
In the future, this could be the base class for all PyTorch mutables, including layer choice, input choice, etc. This is not considered an interface, but rather a base class consisting of commonly used class/instance methods. For API developers, using isinstance(module, Mutable) to check for mutable modules is also not recommended until the design is finalized.
- class nni.retiarii.utils.ContextStack(key, value)[source]¶
This maintains a globally accessible context environment that is visible everywhere.
Use with ContextStack(namespace, value): to initiate, and use get_current_context(namespace) to get the corresponding value in the namespace.
Note that this is not multi-processing safe. Also, the values will get cleared for a new process.
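The usage pattern described above can be illustrated with a toy re-implementation. ToyContextStack below is an illustrative stand-in written for this example, not NNI's code, and the lookup function is a local definition that only borrows the name get_current_context; the sketch assumes that nested contexts under the same key behave like a stack.

```python
from collections import defaultdict


class ToyContextStack:
    """Illustrative stand-in, not NNI's implementation: one stack of values per key."""

    _stacks = defaultdict(list)  # shared by all instances within this process

    def __init__(self, key, value):
        self.key = key
        self.value = value

    def __enter__(self):
        self._stacks[self.key].append(self.value)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self._stacks[self.key].pop()
        return False  # do not swallow exceptions

    @classmethod
    def top(cls, key):
        return cls._stacks[key][-1]


def get_current_context(key):
    """Return the innermost value pushed for `key`."""
    return ToyContextStack.top(key)


with ToyContextStack("mode", "outer"):
    with ToyContextStack("mode", "inner"):
        innermost = get_current_context("mode")
    restored = get_current_context("mode")

print(innermost, restored)
```

Because the stacks live in a class-level dict, they are per-process state, which is consistent with the note that values are not shared across processes.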
- class nni.retiarii.utils.ModelNamespace(key='model')[source]¶
To create an individual namespace for models:
to enable automatic numbering;
to trace general information (like creation of hyper-parameters) of model.
A namespace is bound to a key. Namespaces bound to different keys are completely isolated. A namespace can have sub-namespaces (with the same key). The numbering will be chained (e.g., model_1_4_2).
- nni.retiarii.utils.original_state_dict_hooks(model)[source]¶
Use this patch if you want to save/load state dict in the original state dict hierarchy.
For example, when you already have a state dict for the base model / search space (which often happens when you have trained a supernet with one-shot strategies), the state dict isn’t organized in the same way as when a sub-model is sampled from the search space. This patch will help the modules in the sub-model find the corresponding module in the base model.
The code looks like,
with original_state_dict_hooks(model):
    model.load_state_dict(state_dict_from_supernet, strict=False)  # supernet has extra keys
Or vice-versa,
with original_state_dict_hooks(model):
    supernet_style_state_dict = model.state_dict()