Exploration Strategy

There are two types of model space exploration approaches: the multi-trial strategy and the one-shot strategy. Once the model space has been constructed, users can use either approach to explore it.

Here is the list of exploration strategies that NNI supports.

| Name | Category | Brief Description |
| --- | --- | --- |
| Random | Multi-trial | Randomly sample an architecture each time |
| GridSearch | Multi-trial | Traverse the search space and try all possibilities |
| RegularizedEvolution | Multi-trial | Evolutionary algorithm for NAS. Reference |
| TPE | Multi-trial | Tree-structured Parzen Estimator (TPE). Reference |
| PolicyBasedRL | Multi-trial | Policy-based reinforcement learning, based on the implementation of tianshou. Reference |
| DARTS | One-shot | Continuous relaxation of the architecture representation, allowing efficient search of the architecture using gradient descent. Reference |
| ENAS | One-shot | An RL controller learns to generate the best network on a super-net. Reference |
| GumbelDARTS | One-shot | Choose the best block using Gumbel-Softmax random sampling and differentiable training. Reference |
| RandomOneShot | One-shot | Train a super-net with uniform path sampling. Reference |
| Proxyless | One-shot | An optimized, memory-efficient version of differentiable architecture search. Reference |

Multi-trial strategy

Multi-trial NAS means each model sampled from the model space is trained independently. A typical multi-trial NAS is NASNet. In multi-trial NAS, users need a model evaluator to evaluate the performance of each sampled model, and an exploration strategy to sample models from the defined model space. Users can use the model evaluators provided by NNI or write their own model evaluator, and can simply choose an exploration strategy. Advanced users can also customize a new exploration strategy.

To use an exploration strategy, users simply instantiate one and pass the instantiated object to RetiariiExperiment. Below is a simple example.

import nni.retiarii.strategy as strategy
exploration_strategy = strategy.Random(dedup=True)

Rather than using strategy.Random, users can choose one of the strategies from the table above.
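
For a fuller picture, here is a minimal sketch of wiring the strategy into an experiment. It assumes base_model (a model space) and evaluator are defined elsewhere; max_trial_number, trial_concurrency, and the port are illustrative values.

import nni.retiarii.strategy as strategy
from nni.retiarii.experiment.pytorch import RetiariiExperiment, RetiariiExeConfig

exploration_strategy = strategy.Random(dedup=True)

# `base_model` (a model space) and `evaluator` are assumed to be defined elsewhere.
exp = RetiariiExperiment(base_model, evaluator, [], exploration_strategy)
exp_config = RetiariiExeConfig('local')
exp_config.max_trial_number = 20   # stop after sampling 20 models
exp_config.trial_concurrency = 2   # train 2 sampled models concurrently
exp.run(exp_config, 8081)          # 8081 is the port of the NNI web UI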

One-shot strategy

One-shot NAS algorithms leverage weight sharing among the models in a neural architecture search space to train a supernet, and use this supernet to guide the selection of better models. This type of algorithm greatly reduces computational resources compared to independently training each model from scratch (which we call “Multi-trial NAS”).

Starting from v2.8, the usage of one-shot strategies is much like that of multi-trial strategies: users simply create a strategy and run RetiariiExperiment. Since one-shot strategies manipulate the training recipe, the evaluator needs to be one of the PyTorch-Lightning evaluators, either built-in or customized. Last but not least, don’t forget to set the execution engine to oneshot. Example follows:

import nni.retiarii.strategy as strategy
import nni.retiarii.evaluator.pytorch.lightning as pl
evaluator = pl.Classification(
  # Need to use `pl.DataLoader` instead of `torch.utils.data.DataLoader` here,
  # or use `nni.trace` to wrap `torch.utils.data.DataLoader`.
  train_dataloaders=pl.DataLoader(train_dataset, batch_size=100),
  val_dataloaders=pl.DataLoader(test_dataset, batch_size=100),
  # Other keyword arguments passed to pytorch_lightning.Trainer.
  max_epochs=10,
  gpus=1,
)
exploration_strategy = strategy.DARTS()

exp_config.execution_engine = 'oneshot'
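
Putting the pieces together, a minimal sketch of the full one-shot run follows; it assumes base_model is a model space defined elsewhere, and mirrors the multi-trial usage above.

from nni.retiarii.experiment.pytorch import RetiariiExperiment, RetiariiExeConfig

# `base_model` (a model space) is assumed to be defined elsewhere.
exp = RetiariiExperiment(base_model, evaluator, [], exploration_strategy)
exp_config = RetiariiExeConfig()
exp_config.execution_engine = 'oneshot'
exp.run(exp_config)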

One-shot strategies only support a limited set of Mutation Primitives, and do not support customizing mutators at all. See the reference for the detailed support list of each algorithm.

New in version 2.8: One-shot strategies are now compatible with Lightning accelerators. This means you can accelerate one-shot strategies on hardware such as multiple GPUs. To enable this feature, you only need to pass the keyword arguments that would previously be set in pytorch_lightning.Trainer to your evaluator. See this reference for more details.
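
As a sketch of what this might look like (the argument names accelerator and devices follow recent pytorch_lightning versions and are an assumption here):

evaluator = pl.Classification(
  train_dataloaders=pl.DataLoader(train_dataset, batch_size=100),
  val_dataloaders=pl.DataLoader(test_dataset, batch_size=100),
  max_epochs=10,
  # Forwarded to pytorch_lightning.Trainer; these names assume Lightning >= 1.6.
  accelerator='gpu',
  devices=2,
)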

One-shot strategy (legacy)

Warning

Deprecated since version 2.8: The following usages are deprecated and will be removed in future releases. If you intend to use them, the references can be found here.

The usage of one-shot NAS strategies is a little different from that of multi-trial strategies. One-shot strategies are implemented with a special type of object named Trainer. Following the common practice of one-shot NAS, a Trainer trains the super-net and searches for the optimal architecture in a single run. For example:

from nni.retiarii.oneshot.pytorch import DartsTrainer

# `model`, `criterion`, `optim`, `dataset_train` and `accuracy` are assumed
# to be defined by the user beforehand.
trainer = DartsTrainer(
    model=model,
    loss=criterion,
    metrics=lambda output, target: accuracy(output, target, topk=(1,)),
    optimizer=optim,
    dataset=dataset_train,
    batch_size=32,
    log_frequency=50,
)
trainer.fit()

One-shot strategies can be used without RetiariiExperiment, so the trainer.fit() call above runs the experiment locally.

After trainer.fit() completes, we can use trainer.export() to export the searched architecture (a dict of choices) to a file.

import json

final_architecture = trainer.export()
print('Final architecture:', final_architecture)
json.dump(final_architecture, open('checkpoint.json', 'w'))

Tip

The trained super-net (neither its weights nor the exported JSON) can’t be used directly. It’s only an intermediate result used for deriving the final architecture. The exported architecture (which can be loaded with nni.retiarii.fixed_arch()) needs to be retrained with a standard training recipe to get the final model.
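
For reference, a minimal retraining sketch using the exported checkpoint; ModelSpace is a hypothetical stand-in for the user’s model space class.

from nni.retiarii import fixed_arch

# Instantiate a fixed model from the exported architecture.
# `ModelSpace` is a stand-in for the user's model space class.
with fixed_arch('checkpoint.json'):
    final_model = ModelSpace()

# `final_model` is now a plain PyTorch module and can be trained
# with a standard training recipe.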