NNI supports the training services listed below. See the page for each one to learn how to configure it. NNI is extensible by design, so users can also implement a new training service for their own resources, platform, or needs.

Training Service



The whole experiment runs on your dev machine (i.e., a single local machine)


The trials are dispatched to your configured SSH servers


Running trials on OpenPAI, a DNN model training platform based on Kubernetes


Running trials with Kubeflow, a DNN model training framework based on Kubernetes


Running trials on AdaptDL, an elastic DNN model training platform


Running trials with FrameworkController, a DNN model training framework on Kubernetes


Running trials on Azure Machine Learning (AML) cloud service


Running trials on PAI-DLC, a deep learning container service based on Alibaba ACK


Running trials on several of the above training services jointly in one experiment

Training Service Under Reuse Mode

Since v2.0, NNI has had two sets of training service implementations. The newer one is called reuse mode. When reuse mode is enabled, a cluster, such as a remote machine or a compute instance on AML, launches a long-running environment, and NNI submits trials to that environment one after another, which saves the time otherwise spent creating a new job for each trial. For instance, using the OpenPAI training platform under reuse mode avoids the overhead of repeatedly pulling Docker images, creating containers, and downloading data.
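To make the saving concrete, here is a rough back-of-the-envelope sketch in Python. It is not part of NNI. The function name `total_time` and the numbers (90 s of environment setup, 60 s per trial) are illustrative assumptions, and trials are assumed to run sequentially in a single environment.

```python
def total_time(num_trials: int, setup_s: float, trial_s: float, reuse: bool) -> float:
    """Estimate sequential wall-clock time for a batch of trials.

    Without reuse, every trial pays the environment setup cost
    (image pull, container creation, data download).
    With reuse, the setup cost is paid once for the long-running environment.
    """
    if num_trials == 0:
        return 0.0
    setups = 1 if reuse else num_trials
    return setups * setup_s + num_trials * trial_s


if __name__ == "__main__":
    # Hypothetical numbers, chosen only for illustration.
    n, setup, run = 20, 90.0, 60.0
    without = total_time(n, setup, run, reuse=False)
    with_reuse = total_time(n, setup, run, reuse=True)
    print(f"without reuse: {without:.0f} s")
    print(f"with reuse:    {with_reuse:.0f} s")
    print(f"saved:         {without - with_reuse:.0f} s")
```

Under these assumptions the saving grows linearly with the number of trials, so reuse mode matters most when trials are short relative to environment setup.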


In reuse mode, users must make sure that each trial can run independently within the same job (e.g., avoid loading checkpoints from previous trials).
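One way to keep trials independent is to give each trial its own working directory and always start from fresh state. The sketch below uses only the Python standard library. The helper `run_trial`, the directory layout, and the toy "training" loop are hypothetical. In a real NNI trial the identifier could plausibly come from `nni.get_trial_id()`, but that wiring is an assumption and is not shown here.

```python
import json
import tempfile
from pathlib import Path


def run_trial(trial_id: str, params: dict, root: Path) -> Path:
    """Run one toy trial that touches only its own directory under root."""
    workdir = root / trial_id
    # exist_ok=False makes accidental reuse of another trial's directory fail loudly.
    workdir.mkdir(parents=True, exist_ok=False)

    # Always initialize from scratch; never search sibling directories for checkpoints.
    state = {"step": 0, "lr": params["lr"]}
    for _ in range(3):  # stand-in for real training steps
        state["step"] += 1

    checkpoint = workdir / "checkpoint.json"
    checkpoint.write_text(json.dumps(state))
    return checkpoint


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    first = run_trial("trial_a", {"lr": 0.1}, root)
    second = run_trial("trial_b", {"lr": 0.01}, root)
    # The second trial's checkpoint reflects only its own parameters.
    print(first.parent.name, json.loads(first.read_text()))
    print(second.parent.name, json.loads(second.read_text()))
```

Because both trials share the same long-running environment, writing to a fixed path such as `./checkpoint.json` would let a later trial silently pick up an earlier trial's state. Keying paths by trial avoids that.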