
    Framework Related

    Pruner

    class nni.algorithms.compression.v2.pytorch.base.Pruner(model, config_list)

    The abstract base class for pruning algorithms. Inherit from this class and implement _reset_tools to customize a pruner.

    compress()
    Returns:

    The wrapped model and the masks.

    Return type:

    Tuple[Module, Dict]

    export_model(model_path, mask_path=None)

    Export the pruned model weights, the masks, and (optionally) an ONNX model.

    Parameters:
    • model_path (str) -- Path to save the pruned model's state_dict. The weights and biases have already been multiplied by the masks.

    • mask_path (Optional[str]) -- Path to save the mask dict.

    get_modules_wrapper()
    Returns:

    An ordered dict whose keys are module names and whose values are the wrappers of those modules.

    Return type:

    OrderedDict[str, PrunerModuleWrapper]

    get_origin2wrapped_parameter_name_map()

    Get the mapping of parameter names from the original model to the wrapped model.

    Returns:

    A dict of the form {original_model_parameter_name: wrapped_model_parameter_name}.

    Return type:

    Dict[str, str]

    load_masks(masks)

    Load existing masks onto the wrappers. After loading, you can train the model with these masks applied.

    Parameters:

    masks (Dict[str, Dict[str, Tensor]]) -- The masks dict, in the format {'op_name': {'weight': mask, 'bias': mask}}.
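
The nested mask format used by load_masks() and export_model() can be illustrated without PyTorch. The sketch below is not NNI code: plain nested lists stand in for tensors, and apply_mask is a hypothetical helper that shows what "multiplied by the masks" means element-wise.

```python
# Plain nested lists stand in for tensors; names here are illustrative only.
weights = {"conv1": {"weight": [[0.5, -1.2], [0.03, 0.8]], "bias": [0.1, 0.4]}}

# Same layout as the masks dict: {'op_name': {'weight': mask, 'bias': mask}}.
masks = {"conv1": {"weight": [[1, 1], [0, 0]], "bias": [1, 0]}}


def apply_mask(value, mask):
    """Element-wise multiply two equally shaped nested lists."""
    if isinstance(value, list):
        return [apply_mask(v, m) for v, m in zip(value, mask)]
    return value * mask


# Masked parameters: entries whose mask is 0 become zero.
masked = {
    op_name: {
        target: apply_mask(weights[op_name][target], target_mask)
        for target, target_mask in op_masks.items()
    }
    for op_name, op_masks in masks.items()
}
```

In this toy case the second output channel of conv1 is zeroed in both the weight and the bias, which is the state a saved state_dict would reflect after masking.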

    show_pruned_weights(dim=0)

    Log the simulated pruning sparsity.

    Parameters:

    dim (int) -- The pruned dimension.

    PrunerModuleWrapper

    class nni.algorithms.compression.v2.pytorch.base.PrunerModuleWrapper(module, module_name, config)

    Wrap a module to enable data parallelism, forward-method customization, and buffer registration.

    Parameters:
    • module (Module) -- The module the user wants to compress.

    • config (Dict) -- The configuration the user specifies for compression.

    • module_name (str) -- The name of the module to compress; the wrapper module shares the same name.

    BasicPruner

    class nni.algorithms.compression.v2.pytorch.pruning.basic_pruner.BasicPruner(model, config_list)
    compress()

    Generate the masks. The pruning process is divided into three stages: self.data_collector collects the data used to calculate the specified metric, self.metrics_calculator calculates the metric, and self.sparsity_allocator generates the masks based on the metric.

    Returns:

    The wrapped model and the masks.

    Return type:

    Tuple[Module, Dict]

    reset_tools()

    Reset self.data_collector, self.metrics_calculator and self.sparsity_allocator. Subclasses need to implement this function to complete the pruning process. See compress() to understand how NNI uses these three parts to generate masks for the bound model.
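
The three-stage flow described for compress() can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the NNI implementation: the three functions are hypothetical stand-ins for the data collector, metrics calculator and sparsity allocator, the metric is assumed to be the L1 norm of each output channel, and nested lists replace tensors.

```python
def collect(model_weights):
    """Stage 1 (data collector): gather the data the metric needs."""
    return dict(model_weights)


def calculate_metrics(data):
    """Stage 2 (metrics calculator): L1 norm of each output channel (row)."""
    return {
        name: [sum(abs(w) for w in row) for row in rows]
        for name, rows in data.items()
    }


def generate_sparsity(metrics, sparsity):
    """Stage 3 (sparsity allocator): mask the lowest-scoring channels."""
    masks = {}
    for name, scores in metrics.items():
        num_pruned = int(len(scores) * sparsity)
        order = sorted(range(len(scores)), key=scores.__getitem__)
        pruned = set(order[:num_pruned])
        masks[name] = [0 if i in pruned else 1 for i in range(len(scores))]
    return masks


# One toy layer with four output channels.
weights = {"fc": [[0.1, -0.1], [2.0, 1.0], [0.5, 0.5], [0.0, 0.05]]}
masks = generate_sparsity(calculate_metrics(collect(weights)), sparsity=0.5)
```

With a sparsity of 0.5, the two channels with the smallest L1 norm (the first and the last) are masked.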

    DataCollector

    class nni.algorithms.compression.v2.pytorch.pruning.tools.DataCollector(compressor)

    An abstract class for collecting the data needed by the compressor.

    Parameters:

    compressor (Pruner) -- The compressor bound to this DataCollector.

    collect()

    Collect the data needed by the compressor, e.g., module weights or the output of an activation function.

    Returns:

    Usually has a format like {module_name: tensor_type_data}.

    Return type:

    Dict

    reset(*args, **kwargs)

    Reset the DataCollector.

    MetricsCalculator

    class nni.algorithms.compression.v2.pytorch.pruning.tools.MetricsCalculator(scalers=None)

    An abstract class for calculating a kind of metric from the given data.

    Parameters:

    scalers (Dict[str, Dict[str, Scaling]] | Scaling | None) -- The scaler is used to scale the size of the metrics. It scales each metric to the same size as the shrunk mask in the sparsity allocator. To use different scalers for different pruning targets in different modules, pass a dict {module_name: {target_name: scaler}}. If the allocator meets an unspecified module name, it tries scalers['_default'][target_name] to scale its mask. If the allocator meets an unspecified target name, it tries scalers[module_name]['_default'] to scale its mask. Passing a single scaler instead of a dict of scalers is treated as passing {'_default': {'_default': scalers}}. Passing None means no scaling is needed.
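
The '_default' fallback rules for scalers can be read as a two-level lookup. The sketch below is one possible interpretation, not NNI's actual code: resolve_scaler is a hypothetical helper, strings stand in for Scaling objects, and the behavior when neither a name nor a '_default' entry matches (returning None) is an assumption.

```python
def resolve_scaler(scalers, module_name, target_name):
    """Pick the scaler for one pruning target, following the '_default' rules."""
    if scalers is None:
        return None  # no scaling requested
    if not isinstance(scalers, dict):
        # A bare scaler is treated as {'_default': {'_default': scaler}}.
        scalers = {"_default": {"_default": scalers}}
    # Unspecified module name: fall back to the '_default' module entry.
    per_module = scalers.get(module_name, scalers.get("_default", {}))
    # Unspecified target name: fall back to the module's '_default' target.
    return per_module.get(target_name, per_module.get("_default"))


# Strings stand in for Scaling objects in this illustration.
scalers = {
    "conv1": {"weight": "conv1-weight-scaler", "_default": "conv1-default-scaler"},
    "_default": {"weight": "global-weight-scaler"},
}
```

For example, resolve_scaler(scalers, "fc", "weight") falls back to the '_default' module entry, while resolve_scaler(scalers, "conv1", "bias") falls back to conv1's '_default' target.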

    calculate_metrics(data)
    Parameters:

    data (Dict) -- A dict holding the data used to calculate metrics. Usually has a format like {module_name: tensor_type_data}.

    Returns:

    A dict whose keys are layer names and whose values are metrics. Note that each metric has the same size as the data on dim.

    Return type:

    Dict[str, Tensor]

    SparsityAllocator

    class nni.algorithms.compression.v2.pytorch.pruning.tools.SparsityAllocator(pruner, scalers=None, continuous_mask=True)

    A base class for allocating masks based on metrics.

    Parameters:
    • pruner (Pruner) -- The pruner bound to this SparsityAllocator.

    • scalers (Dict[str, Dict[str, Scaling]] | Scaling | None) -- The scaler is used to scale the size of the masks. It shrinks a mask of the same size as the pruning target to the size of the metric, or expands a mask of the same size as the metric to the size of the pruning target. To use different scalers for different pruning targets in different modules, pass a dict {module_name: {target_name: scaler}}. If the allocator meets an unspecified module name, it tries scalers['_default'][target_name] to scale its mask. If the allocator meets an unspecified target name, it tries scalers[module_name]['_default'] to scale its mask. Passing a single scaler instead of a dict of scalers is treated as passing {'_default': {'_default': scalers}}. Passing None means no scaling is needed.

    • continuous_mask (bool) -- If True, the parts that have already been masked are masked first. If False, parts that have been masked may be unmasked when their corresponding metric increases.
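
The effect of continuous_mask can be shown on a single one-dimensional metric. This is a hedged sketch of the documented behavior only: allocate is a hypothetical function, and lowering the score of already-masked entries below every other score is an assumed way to make them "masked first", not necessarily how NNI does it.

```python
def allocate(metric, old_mask, num_pruned, continuous_mask=True):
    """Return a 0/1 mask that prunes the num_pruned lowest-scoring entries."""
    scores = list(metric)
    if continuous_mask:
        # Push already-masked entries below every other score,
        # so they are selected for pruning before anything else.
        floor = min(scores) - 1.0
        scores = [floor if m == 0 else s for s, m in zip(scores, old_mask)]
    order = sorted(range(len(scores)), key=scores.__getitem__)
    pruned = set(order[:num_pruned])
    return [0 if i in pruned else 1 for i in range(len(scores))]


old_mask = [0, 1, 1, 1]        # entry 0 was masked in an earlier round
metric = [9.0, 0.1, 0.2, 5.0]  # ...but its metric has since increased

kept_masked = allocate(metric, old_mask, 2, continuous_mask=True)
may_unmask = allocate(metric, old_mask, 2, continuous_mask=False)
```

With continuous_mask=True entry 0 stays masked despite its high metric; with continuous_mask=False it is unmasked and the two lowest-metric entries are pruned instead.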

    common_target_masks_generation(metrics)

    Generate masks for metrics-dependent targets.

    Parameters:

    metrics (Dict[str, Dict[str, Tensor]]) -- The format is {module_name: {target_name: target_metric}}. The metric usually has the same size as the shrunk mask.

    Returns:

    The format is {module_name: {target_name: mask}}. Each mask has the same size as its target.

    Return type:

    Dict[str, Dict[str, Tensor]]

    generate_sparsity(metrics)

    The main function of SparsityAllocator; generates a set of masks based on the given metrics.

    Parameters:

    metrics (Dict) -- A metric dict with the format {module_name: weight_metric}.

    Returns:

    The masks, in the format {module_name: {target_name: mask}}.

    Return type:

    Dict[str, Dict[str, Tensor]]

    special_target_masks_generation(masks)

    The mask generation of some pruning targets depends on other targets, e.g., the bias mask depends on the weight mask. This function generates these masks and is called at the end of generate_sparsity.

    Parameters:

    masks (Dict[str, Dict[str, Tensor]]) -- The format is {module_name: {target_name: mask}}. It is usually the return value of common_target_masks_generation.
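
As an illustration of a dependent target, the sketch below derives a bias mask from a weight mask. It is not the NNI implementation: add_bias_masks is a hypothetical helper, nested lists stand in for tensors, and the rule "a bias entry is masked when its whole output channel is masked" is an assumption chosen for the example.

```python
def bias_mask_from_weight_mask(weight_mask):
    """Keep a bias entry only if its output channel (row) keeps any weight."""
    return [1 if any(row) else 0 for row in weight_mask]


def add_bias_masks(masks):
    """Fill in the 'bias' mask of every module that has a 'weight' mask."""
    for target_masks in masks.values():
        if "weight" in target_masks:
            target_masks["bias"] = bias_mask_from_weight_mask(target_masks["weight"])
    return masks


# Input has the same layout as the return value of common_target_masks_generation.
masks = add_bias_masks({"fc": {"weight": [[1, 1], [0, 0], [0, 1]]}})
```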

    BasePruningScheduler

    class nni.algorithms.compression.v2.pytorch.base.BasePruningScheduler
    compress()

    The main loop of the pruning schedule.

    generate_task()
    Returns:

    The next pruning task.

    Return type:

    Optional[Task]

    get_best_result()
    Returns:

    The task result with the best performance, including the task id, the compact model, the masks on the compact model, the score, and the config list used in this task.

    Return type:

    Tuple[int, Module, Dict[str, Dict[str, Tensor]], float, List[Dict]]

    pruning_one_step(task)

    Prune the model defined in the task.

    Parameters:

    task (Task) -- The pruning task in this step.

    Returns:

    The result of the task in this step.

    Return type:

    TaskResult

    record_task_result(task_result)
    Parameters:

    task_result (TaskResult) -- The result of the task.
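
The method signatures above suggest how a scheduler's main loop might tie them together: request a task, prune one step, record the result, and stop when no task is left. The toy class below is a sketch under that assumption, not NNI's scheduler; ToyScheduler, its sparsity-valued tasks and its made-up scoring rule are all hypothetical.

```python
class ToyScheduler:
    """Minimal stand-in showing how the scheduler methods could compose."""

    def __init__(self, sparsities):
        self._pending = list(sparsities)  # each "task" is just a sparsity
        self._results = []

    def generate_task(self):
        # Returns None when there is nothing left, mirroring Optional[Task].
        return self._pending.pop(0) if self._pending else None

    def pruning_one_step(self, task):
        # Pretend the score drops linearly as sparsity grows.
        return {"sparsity": task, "score": 1.0 - task}

    def record_task_result(self, task_result):
        self._results.append(task_result)

    def get_best_result(self):
        return max(self._results, key=lambda r: r["score"])

    def compress(self):
        task = self.generate_task()
        while task is not None:
            self.record_task_result(self.pruning_one_step(task))
            task = self.generate_task()


scheduler = ToyScheduler([0.2, 0.5, 0.8])
scheduler.compress()
best = scheduler.get_best_result()
```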

    TaskGenerator

    class nni.algorithms.compression.v2.pytorch.pruning.tools.TaskGenerator(origin_model, origin_masks={}, origin_config_list=[], log_dir='.', keep_intermediate_result=False, best_result_mode='maximize')

    This class is used to generate the config list for the pruner in each iteration.

    Parameters:
    • origin_model (Optional[Module]) -- The original, unwrapped PyTorch model to be pruned.

    • origin_masks (Optional[Dict[str, Dict[str, Tensor]]]) -- The pre-existing masks on the original model. These masks may be user-defined or generated by a previous pruning run.

    • origin_config_list (Optional[List[Dict]]) -- The original config list provided by the user. Note that this config list directly configures the original model, which means the sparsity provided by origin_masks should also be recorded in origin_config_list.

    • log_dir (Union[str, Path]) -- The log directory used to save the task generator log.

    • keep_intermediate_result (bool) -- Whether to keep the intermediate results, including the intermediate model and masks from each iteration.

    • best_result_mode (Literal['latest', 'maximize', 'minimize']) --

      The way to decide which result is the best. Three modes are supported. If the task results don't contain scores (task_result.score is None), it falls back to latest.

      1. latest: The most recently received result is the best result.

      2. maximize: The result with the largest task result score is the best result.

      3. minimize: The result with the smallest task result score is the best result.
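
The three best_result_mode options can be expressed as a small selection function. This sketch is illustrative only: pick_best is a hypothetical helper operating on (task_id, score) pairs, and falling back to latest as soon as any score is None is one assumed reading of the fallback rule.

```python
def pick_best(results, mode="maximize"):
    """Return the best task id from (task_id, score) pairs in arrival order."""
    if not results:
        return None
    # Without scores there is nothing to compare, so use the newest result.
    if mode == "latest" or any(score is None for _, score in results):
        return results[-1][0]
    chooser = max if mode == "maximize" else min
    return chooser(results, key=lambda r: r[1])[0]


scored = [(0, 0.7), (1, 0.9), (2, 0.8)]
unscored = [(0, None), (1, None)]
```

Here pick_best(scored, "maximize") selects task 1, pick_best(scored, "minimize") selects task 0, and pick_best(unscored, "maximize") falls back to the latest task.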

    get_best_result()
    Returns:

    If self._best_task_id is not None, return the best task id, the best compact model, the masks on the compact model, the score, and the config list used in this task.

    Return type:

    Optional[Tuple[int, Module, Dict[str, Dict[str, Tensor]], float, List[Dict]]]

    next()
    Returns:

    The next task from the pending tasks.

    Return type:

    Optional[Task]

    receive_task_result(task_result)
    Parameters:

    task_result (TaskResult) -- The result of the task.

    Quantizer

    class nni.compression.pytorch.compressor.Quantizer(model, config_list, optimizer=None, dummy_input=None)

    Base class for PyTorch quantizers.

    export_model(model_path, calibration_path=None, onnx_path=None, input_shape=None, device=None)

    Export the quantized model weights and calibration parameters.

    Parameters:
    • model_path (str) -- path to save the quantized model weights

    • calibration_path (str) -- (optional) path to save the quantization parameters after calibration

    • onnx_path (str) -- (optional) path to save the ONNX model

    • input_shape (list or tuple) -- input shape of the ONNX model

    • device (torch.device) -- device of the model, used to place the dummy input tensor when exporting the ONNX file. The tensor is placed on the CPU if `device` is None

    Return type:

    Dict

    export_model_save(model, model_path, calibration_config=None, calibration_path=None, onnx_path=None, input_shape=None, device=None)

    This method helps the quantizer save the PyTorch model, the calibration config, and the ONNX model.

    Parameters:
    • model (pytorch model) -- PyTorch model to be saved

    • model_path (str) -- path to save the PyTorch model

    • calibration_config (dict) -- (optional) config of calibration parameters

    • calibration_path (str) -- (optional) path to save the quantization parameters after calibration

    • onnx_path (str) -- (optional) path to save the ONNX model

    • input_shape (list or tuple) -- input shape of the ONNX model

    • device (torch.device) -- device of the model, used to place the dummy input tensor when exporting the ONNX file. The tensor is placed on the CPU if `device` is None

    find_conv_bn_patterns(model, dummy_input)

    Find all Conv-BN patterns, used for batch normalization folding.

    Parameters:
    • model (torch.nn.Module) -- model to be analyzed.

    • dummy_input (tuple of torch.Tensor) -- inputs to the model, used for generating the TorchScript

    fold_bn(*inputs, wrapper)

    Simulate batch normalization folding in the training graph. The folded weight and bias are returned for use in the following operations.

    Parameters:
    • inputs (tuple of torch.Tensor) -- inputs for the module

    • wrapper (QuantizerModuleWrapper) -- the wrapper for the original module

    Return type:

    Tuple of torch.Tensor
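
The arithmetic behind batch normalization folding can be checked on a single channel with scalar values. This is a generic sketch of the standard folding formula, not the code of fold_bn: fold_bn_scalar and conv_then_bn are hypothetical names, and the use of fixed mean and variance values is an assumption made to keep the example self-contained.

```python
import math


def fold_bn_scalar(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold BN parameters into one channel's conv weight and bias."""
    scale = gamma / math.sqrt(var + eps)
    return weight * scale, beta + (bias - mean) * scale


def conv_then_bn(x, weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Reference path: a 1x1 'convolution' followed by batch normalization."""
    y = weight * x + bias
    return gamma * (y - mean) / math.sqrt(var + eps) + beta


params = dict(weight=2.0, bias=0.5, gamma=1.5, beta=-0.2, mean=0.3, var=4.0)
folded_weight, folded_bias = fold_bn_scalar(**params)

x = 1.7
folded_output = folded_weight * x + folded_bias
reference_output = conv_then_bn(x, **params)
```

The folded layer and the two-step reference produce the same output up to floating-point error, which is what makes it possible to quantize the folded weight and bias directly.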

    load_calibration_config(calibration_config)

    This function helps a quantizer set its quantization parameters by loading them from a calibration_config exported by another quantizer or by itself. Its main use is to give a quantization-aware training quantizer appropriate initial parameters, so that training is more flexible and converges quickly. It also lets a quantizer resume a quantized model by loading parameters from the config.

    Parameters:

    calibration_config (dict) -- dict that stores quantization parameters. A quantizer can export its own calibration config, e.g., calibration_config = quantizer.export_model(model_path, calibration_path)

    quantize_input(inputs, wrapper, **kwargs)

    Quantizers should override this method to quantize the input. This method is effectively hooked to forward() of the model.

    Parameters:
    • inputs (Tensor) -- inputs that need to be quantized

    • wrapper (QuantizerModuleWrapper) -- the wrapper for the original module

    quantize_output(output, wrapper, **kwargs)

    Quantizers should override this method to quantize the output. This method is effectively hooked to forward() of the model.

    Parameters:
    • output (Tensor) -- output that needs to be quantized

    • wrapper (QuantizerModuleWrapper) -- the wrapper for the original module

    quantize_weight(wrapper, **kwargs)

    Quantizers should override this method to quantize the weight. This method is effectively hooked to forward() of the model.

    Parameters:

    wrapper (QuantizerModuleWrapper) -- the wrapper for the original module

    record_shape(model, dummy_input)

    Record the input and output shapes of each module to be quantized.

    Parameters:
    • model (torch.nn.Module) -- model to be recorded.

    • dummy_input (tuple of torch.Tensor) -- inputs to the model.

    QuantizerModuleWrapper

    class nni.compression.pytorch.compressor.QuantizerModuleWrapper(module, module_name, module_type, config, quantizer, bn_module=None)

    QuantGrad

    class nni.compression.pytorch.compressor.QuantGrad(*args, **kwargs)

    Base class for overriding the backward function of a quantization operation.

    classmethod get_bits_length(config, quant_type)

    Get the number of bits for a quantization config.

    Parameters:
    • config (Dict) -- the configuration for quantization

    • quant_type (str) -- quant type

    Returns:

    n-bits for the quantization configuration

    Return type:

    int

    static quant_backward(tensor, grad_output, quant_type, scale, zero_point, qmin, qmax)

    This method should be overridden by subclasses to provide a customized backward function. The default implementation is the Straight-Through Estimator.

    Parameters:
    • tensor (Tensor) -- input of the quantization operation

    • grad_output (Tensor) -- gradient of the output of the quantization operation

    • quant_type (QuantType) -- the type of quantization; it can be QuantType.INPUT, QuantType.WEIGHT, or QuantType.OUTPUT. You can define different behavior for different types.

    • scale (Tensor) -- scale for quantizing tensor

    • zero_point (Tensor) -- zero_point for quantizing tensor

    • qmin (Tensor) -- quant_min for quantizing tensor

    • qmax (Tensor) -- quant_max for quantizing tensor

    Returns:

    gradient of the input of the quantization operation

    Return type:

    Tensor
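
The roles of scale, zero_point, qmin and qmax, and the idea of a Straight-Through Estimator, can be sketched with plain floats. This is a generic illustration of affine quantization, not NNI's code: quantize, dequantize and ste_backward are hypothetical helpers, and the identity gradient shown is the simplest form of the estimator.

```python
def quantize(x, scale, zero_point, qmin, qmax):
    """Map a real value to an integer level, clamped to [qmin, qmax]."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map an integer level back to the real value it represents."""
    return (q - zero_point) * scale


def ste_backward(grad_output):
    """Straight-Through Estimator: treat rounding as the identity function,
    so the gradient of the output is passed to the input unchanged."""
    return grad_output


level = quantize(0.26, scale=0.1, zero_point=0, qmin=0, qmax=255)
restored = dequantize(level, scale=0.1, zero_point=0)
saturated = quantize(100.0, scale=0.1, zero_point=0, qmin=0, qmax=255)
grad_input = ste_backward(0.5)
```

Rounding has zero gradient almost everywhere, which is why a custom backward function such as the estimator above is needed for quantization-aware training. A subclass could, for instance, zero the gradient for values that were clamped.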

    © Copyright 2022, Microsoft.
    Created using Sphinx 5.3.0. and Material for Sphinx