Compression Utilities

SensitivityAnalysis

class nni.compression.pytorch.utils.SensitivityAnalysis(model, val_func, sparsities=None, prune_type='l1', early_stop_mode=None, early_stop_value=None)[source]

Perform sensitivity analysis for this model.

Parameters
  • model (torch.nn.Module) – the model on which to perform sensitivity analysis

  • val_func (function) – validation function for the model. Because different models may need different datasets/criteria, the user needs to cover this part themselves. In val_func, the model should be tested on the validation dataset, and the validation accuracy/loss should be returned as the output of val_func. There are no restrictions on the input parameters of val_func; users can use the val_args and val_kwargs parameters of analysis to pass all the parameters that val_func needs.

  • sparsities (list) – The sparsity list provided by the user. This parameter is set when the user only wants to test some specific sparsities. Each element in the sparsity list is a sparsity value that indicates how much weight the pruner should prune. Taking [0.25, 0.5, 0.75] as an example, SensitivityAnalysis will gradually prune 25%, 50%, and 75% of the weights for each layer.

  • prune_type (str) – The pruner type used to prune the conv layers. The default is ‘l1’; ‘l2’ and ‘fine-grained’ are also supported.

  • early_stop_mode (str) –

    If this flag is set, the sensitivity analysis for a conv layer will stop early when the validation metric (for example, accuracy/loss) already meets the threshold. Four early stop modes are supported: minimize, maximize, dropped, raised. The default value is None, which means the analysis will not stop until all given sparsities have been tested. This option should be used together with early_stop_value.

    minimize: The analysis stops when the validation metric returned by val_func is lower than early_stop_value.
    maximize: The analysis stops when the validation metric returned by val_func is larger than early_stop_value.
    dropped: The analysis stops when the validation metric has dropped by early_stop_value.
    raised: The analysis stops when the validation metric has risen by early_stop_value.

  • early_stop_value (float) – This value is used as the threshold for the different early stop modes. It is effective only when early_stop_mode is set.

analysis(val_args=None, val_kwargs=None, specified_layers=None)[source]

This function analyzes the sensitivity to pruning of each conv layer in the target model. If specified_layers is not set, all conv layers are analyzed by default. Users can specify a subset of layers to analyze, which also makes it easy to parallelize the analysis process.

Parameters
  • val_args (list) – args for val_func

  • val_kwargs (dict) – kwargs for val_func

  • specified_layers (list) – list of layer names whose sensitivity should be analyzed. If this variable is set, only the conv layers specified in the list are analyzed. Users can also use this option to easily parallelize the sensitivity analysis.

Returns

sensitivities – dict object that stores the trajectory of the accuracy/loss as the prune ratio changes

Return type

dict
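
A minimal usage sketch is shown below. Here model is assumed to be an already-defined torch.nn.Module and val_loader an already-built validation DataLoader; the val_func shown is only an illustration, since its signature is unrestricted and its arguments are forwarded through val_args/val_kwargs:

    import torch
    from nni.compression.pytorch.utils import SensitivityAnalysis

    # Illustrative evaluation function: computes top-1 accuracy on a
    # validation DataLoader. Everything it needs is forwarded through
    # the val_args / val_kwargs arguments of analysis().
    def val_func(model, val_loader):
        model.eval()
        correct, total = 0, 0
        with torch.no_grad():
            for data, target in val_loader:
                pred = model(data).argmax(dim=1)
                correct += (pred == target).sum().item()
                total += target.size(0)
        return correct / total

    s_analyzer = SensitivityAnalysis(
        model, val_func,
        sparsities=[0.25, 0.5, 0.75],
        prune_type='l1',
        early_stop_mode='dropped',   # stop a layer once accuracy has dropped by 0.05
        early_stop_value=0.05)

    # model and val_loader are forwarded to val_func via val_args.
    sensitivities = s_analyzer.analysis(val_args=[model, val_loader])
    s_analyzer.export('sensitivity.csv')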

export(filepath)[source]

Export the results of the sensitivity analysis to a csv file. The first line of the csv file describes the content structure; it is constructed from ‘layername’ and the sparsity list. Each line below it records the validation metrics returned by val_func when the corresponding layer is pruned at different sparsities. Note that, due to the early_stop option, some layers may not have metrics for all sparsities.

layername, 0.25, 0.5, 0.75
conv1, 0.6, 0.55
conv2, 0.61, 0.57, 0.56

Parameters

filepath (str) – Path of the output file

load_state_dict(state_dict)[source]

Update the weights of the model.

update_already_pruned(layername, ratio)[source]

Set the already pruned ratio for the target layer.

ChannelDependency

class nni.compression.pytorch.utils.ChannelDependency(model, dummy_input, traced_model=None, prune_type='Filter')[source]

This class analyzes the channel dependencies between the conv layers in a model.

Parameters
  • model (torch.nn.Module) – The model to be analyzed.

  • dummy_input (torch.Tensor) – The example input data used to trace the network architecture.

  • traced_model (torch._C.Graph) – If we already have the traced graph of the target model, we do not need to trace the model again.

  • prune_type (str) – This parameter indicates the channel pruning type: 1) Filter: prune the filters of the convolution layer to prune the corresponding channels; 2) Batchnorm: prune the channels in the batchnorm layer.

build_dependency()[source]

Build the channel dependency for the conv layers in the model.

property dependency_sets

Get the list of dependency sets.

Returns

dependency_sets – list of the dependency sets. For example, [set([‘conv1’, ‘conv2’]), set([‘conv3’, ‘conv4’])]

Return type

list

export(filepath)[source]

Export the channel dependencies to a csv file. The layers on the same line have output channel dependencies with each other. For example, layer1.1.conv2, conv1, and layer1.0.conv2 have output channel dependencies with each other, which means the numbers of output channels (filters) of these three layers should be the same; otherwise the model may have shape conflicts. Output example:

Dependency Set,Convolutional Layers
Set 1,layer1.1.conv2,layer1.0.conv2,conv1
Set 2,layer1.0.conv1
Set 3,layer1.1.conv1
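
A minimal usage sketch (torchvision's resnet18 is only an example model; any torch.nn.Module that can be traced with the given dummy input works):

    import torch
    from torchvision.models import resnet18
    from nni.compression.pytorch.utils import ChannelDependency

    model = resnet18()
    dummy_input = torch.rand(1, 3, 224, 224)

    channel_depen = ChannelDependency(model, dummy_input)
    channel_depen.build_dependency()

    # Layers in the same set must keep the same number of output channels.
    for dep_set in channel_depen.dependency_sets:
        print(dep_set)

    channel_depen.export('channel_dependency.csv')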

GroupDependency

class nni.compression.pytorch.utils.GroupDependency(model, dummy_input, traced_model=None)[source]

This class analyzes the group dependencies between the conv layers in a model.

Parameters
  • model (torch.nn.Module) – The model to be analyzed.

  • dummy_input (torch.Tensor) – The example input data used to trace the network architecture.

  • traced_model (torch._C.Graph) – If we already have the traced graph of the target model, we do not need to trace the model again.

build_dependency()[source]

Build the channel dependency for the conv layers in the model. This function returns the group count of each conv layer. Note that the group count of a conv layer may be larger than its original number of groups, because the input channels of a group conv layer are also grouped. To make this clear, assume we have two group conv layers: conv1 (group=2) and conv2 (group=4), where conv2 takes the output features of conv1 as its input. Then the filters of conv1 must still be divisible into 4 groups after filter pruning, because the input channels of conv2 should be divided into 4 groups.

Returns

self.dependency – key: the name of a conv layer; value: the minimum number that the layer's number of filters should be divisible by.

Return type

dict

export(filepath)[source]

Export the group dependency to a csv file. Each line describes a convolution layer: the first part of the line is the PyTorch module name of the conv layer, and the second part is the group count of the filters in this layer. Note that the group count may be larger than this layer's original number of groups. Output example:

Conv layer, Groups
Conv1, 1
Conv2, 2
Conv3, 4
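
A minimal usage sketch (resnet18 is again only a placeholder model; a model with grouped convolutions would produce group counts larger than 1):

    import torch
    from torchvision.models import resnet18
    from nni.compression.pytorch.utils import GroupDependency

    model = resnet18()
    dummy_input = torch.rand(1, 3, 224, 224)

    group_depen = GroupDependency(model, dummy_input)
    dependency = group_depen.build_dependency()

    # For each conv layer: the number its remaining filters must be divisible by.
    for layer_name, group in dependency.items():
        print(layer_name, group)

    group_depen.export('group_dependency.csv')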

ChannelMaskConflict

class nni.compression.pytorch.utils.ChannelMaskConflict(masks, model, dummy_input, traced=None)[source]

ChannelMaskConflict fixes the mask conflicts between layers that have channel dependencies with each other.

Parameters
  • masks (dict) – a dict object that stores the masks

  • model (torch.nn.Module) – model to fix the mask conflict

  • dummy_input (torch.Tensor) – input example to trace the model

  • traced (torch._C.torch.jit.TopLevelTracedModule) – the traced graph of the target model. If this parameter is not None, the model and dummy_input are not used to obtain the traced graph.

fix_mask()[source]

Fix the mask conflict before the mask inference for the layers that have shape dependencies. This function should be called before the mask inference of the ‘speedup’ module. Only structured pruning masks are supported.

GroupMaskConflict

class nni.compression.pytorch.utils.GroupMaskConflict(masks, model, dummy_input, traced=None)[source]

GroupMaskConflict fixes the mask conflicts between layers that have group dependencies with each other.

Parameters
  • masks (dict) – a dict object that stores the masks

  • model (torch.nn.Module) – model to fix the mask conflict

  • dummy_input (torch.Tensor) – input example to trace the model

  • traced (torch._C.torch.jit.TopLevelTracedModule) – the traced model of the target model. If this parameter is not None, the model and dummy_input are not used to obtain the traced graph.

fix_mask()[source]

Fix the mask conflict before the mask inference for the layers that have group dependencies. This function should be called before the mask inference of the ‘speedup’ module.
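
A minimal sketch of fixing mask conflicts before running the speedup module. Here model is assumed to be the pruned model and ‘mask.pth’ a placeholder path to the mask dict exported by a pruner; the sketch also assumes fix_mask() hands back the updated mask dict (the dict passed in is updated as well):

    import torch
    from nni.compression.pytorch.utils import GroupMaskConflict, ChannelMaskConflict

    # masks is assumed to be the mask dict produced by a pruner,
    # e.g. loaded from the mask file the pruner exported.
    masks = torch.load('mask.pth')
    dummy_input = torch.rand(1, 3, 224, 224)

    # Resolve group dependencies first, then channel dependencies, so the
    # fixed masks are consistent when the speedup module infers shapes.
    masks = GroupMaskConflict(masks, model, dummy_input).fix_mask()
    masks = ChannelMaskConflict(masks, model, dummy_input).fix_mask()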

count_flops_params

nni.compression.pytorch.utils.count_flops_params(model, x, custom_ops=None, verbose=True, mode='default')[source]

Count FLOPs and Params of the given model. This function identifies the masks on the modules and takes the pruned shapes into consideration. Note that, for structured pruning, we only identify the remaining filters according to their masks and do not take the pruned input channels into consideration, so the calculated FLOPs will be larger than the real number.

The FLOPs are counted “per sample”, which means that if the input has a batch size larger than 1, the calculated FLOPs do not differ from those for a batch size of 1.

Parameters
  • model (nn.Module) – Target model.

  • x (tuple or tensor) – The input shape of the data (a tuple), a tensor, or a tuple of tensors as input data.

  • custom_ops (dict) – A mapping from a module type (torch.nn.Module) to a custom operation. The custom operation is a callback function that calculates the module's FLOPs and parameters, and it overrides the default operation. For reference, please see ops in ModelProfiler.

  • verbose (bool) – If False, mute detail information about modules. Default is True.

  • mode (str) – The mode of how to collect information. If the mode is set to default, only the information of convolution and linear layers is collected. If the mode is set to full, other operations are also collected.

Returns

Representing total FLOPs, total parameters, and a detailed list of results, respectively. The list of results is a list of dicts, each of which contains (name, module_type, weight_shape, flops, params, input_size, output_size) as its keys.

Return type

tuple of (int, int, list of dict)
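
A minimal usage sketch (resnet18 is only an example model):

    import torch
    from torchvision.models import resnet18
    from nni.compression.pytorch.utils import count_flops_params

    model = resnet18()

    # Either pass the input shape as a tuple ...
    flops, params, results = count_flops_params(model, (1, 3, 224, 224))

    # ... or pass a concrete tensor; FLOPs are reported per sample.
    flops, params, results = count_flops_params(
        model, torch.rand(1, 3, 224, 224), verbose=False)

    print(f'FLOPs: {flops}, Params: {params}')
    for item in results:
        print(item['name'], item['module_type'], item['flops'], item['params'])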

compute_sparsity

nni.algorithms.compression.v2.pytorch.utils.pruning.compute_sparsity(origin_model, compact_model, compact_model_masks, config_list)[source]

This function computes how much the original model has been compressed in the current state. The current state means compact_model + compact_model_masks (i.e., compact_model_masks applied on compact_model). The compact model is the original model after pruning, and it may have a different structure from origin_model because of speedup.

Parameters
  • origin_model (torch.nn.Module) – The original un-pruned model.

  • compact_model (torch.nn.Module) – The model after speedup, or the original model.

  • compact_model_masks (Dict[str, Dict[str, Tensor]]) – The masks applied on the compact model; if the original model has been sped up, this should be {}.

  • config_list (List[Dict]) – The config_list used when pruning the original model.

Returns

(current2origin_sparsity, compact2origin_sparsity, mask2compact_sparsity). current2origin_sparsity is how much the original model has been compressed in the current state. compact2origin_sparsity is the sparsity obtained by comparing the structure of the original model and the compact model. mask2compact_sparsity is the sparsity computed by counting the zero values in the masks.

Return type

Tuple[List[Dict], List[Dict], List[Dict]]
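
A minimal sketch (origin_model, compact_model, and compact_model_masks are assumed to come from a completed pruning run, and the config_list shown is only an example; in practice it should be the config actually used for pruning):

    from nni.algorithms.compression.v2.pytorch.utils.pruning import compute_sparsity

    # Example config; in practice use the config_list passed to the pruner.
    config_list = [{'sparsity': 0.5, 'op_types': ['Conv2d']}]

    # Masks still applied on the (not yet sped-up) compact model.
    current2origin, compact2origin, mask2compact = compute_sparsity(
        origin_model, compact_model, compact_model_masks, config_list)

    # If the compact model has already been sped up, no masks remain.
    current2origin, compact2origin, mask2compact = compute_sparsity(
        origin_model, compact_model, {}, config_list)

    print(current2origin)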