Run an Experiment on Hybrid Mode¶
Run NNI on hybrid mode means that NNI will run trials jobs in multiple kinds of training platforms. For example, NNI could submit trial jobs to remote machine and AML simultaneously.
NNI has supported local, remote, PAI, AML, Kubeflow, FrameworkController,for hybrid training service. Before starting an experiment using these mode, users should setup the corresponding environment for the platforms. More details about the environment setup could be found in the corresponding docs.
Run an experiment¶
examples/trials/mnist-tfv1 as an example. The NNI config YAML file’s content is like:
experimentName: MNIST searchSpaceFile: search_space.json trialCommand: python3 mnist.py trialCodeDirectory: . trialConcurrency: 2 trialGpuNumber: 0 maxExperimentDuration: 24h maxTrialNumber: 100 tuner: name: TPE classArgs: optimize_mode: maximize trainingService: - platform: remote machineList: - host: 127.0.0.1 user: bob password: bob - platform: local
To use hybrid training services, users should set training service configurations as a list in trainingService field. Currently, hybrid support setting local, remote, pai, aml, kubeflow and frameworkcontroller training services.