Compression Utilities

ChannelDependency

class nni.compression.pytorch.utils.ChannelDependency(model, dummy_input, traced_model=None, prune_type='Filter')[source]

This class analyzes the channel dependencies between the conv layers in a model.

Parameters:
  • model (torch.nn.Module) – The model to be analyzed.

  • dummy_input (torch.Tensor) – The example input data used to trace the network architecture.

  • traced_model (torch._C.Graph) – If we already have the traced graph of the target model, we do not need to trace the model again.

  • prune_type (str) – This parameter indicates the channel pruning type: 1) Filter: prune the filters of the convolution layer to prune the corresponding channels; 2) Batchnorm: prune the channels in the batchnorm layer.

build_dependency()[source]

Build the channel dependency for the conv layers in the model.

property dependency_sets

Get the list of dependency sets.

Returns:

dependency_sets – list of the dependency sets. For example, [set(['conv1', 'conv2']), set(['conv3', 'conv4'])]

Return type:

list

export(filepath)[source]

Export the channel dependencies to a CSV file. The layers on the same line have output channel dependencies with each other. For example, layer1.1.conv2, conv1, and layer1.0.conv2 have output channel dependencies with each other, which means the numbers of output channels (filters) of these three layers must be the same; otherwise the model may have a shape conflict.

Output example:

    Dependency Set,Convolutional Layers
    Set 1,layer1.1.conv2,layer1.0.conv2,conv1
    Set 2,layer1.0.conv1
    Set 3,layer1.1.conv1
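
Below is a minimal usage sketch of ChannelDependency. The torchvision ResNet-18 model and the file name are only illustrative, and build_dependency() is called explicitly here even though the constructor may already build the dependencies:

    import torch
    from torchvision.models import resnet18
    from nni.compression.pytorch.utils import ChannelDependency

    model = resnet18()
    dummy_input = torch.rand(1, 3, 224, 224)

    # Trace the model and group the conv layers that must keep the same
    # number of output channels.
    channel_dependency = ChannelDependency(model, dummy_input)
    channel_dependency.build_dependency()

    # Each set contains layer names whose output channel counts must match.
    for dep_set in channel_dependency.dependency_sets:
        print(dep_set)

    # Dump the same information to a CSV file.
    channel_dependency.export('channel_dependency.csv')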

GroupDependency

class nni.compression.pytorch.utils.GroupDependency(model, dummy_input, traced_model=None)[source]

This class analyzes the group dependencies between the conv layers in a model.

Parameters:
  • model (torch.nn.Module) – The model to be analyzed.

  • dummy_input (torch.Tensor) – The example input data to trace the network architecture.

  • traced_model (torch._C.Graph) – If we already have the traced graph of the target model, we do not need to trace the model again.

build_dependency()[source]

Build the channel dependency for the conv layers in the model. This function returns the group count of each conv layer. Note that the group count of a conv layer may be larger than its original number of groups, because the input channels of a group conv layer are also grouped. To make this clear, assume we have two group conv layers: conv1 (group=2) and conv2 (group=4), where conv2 takes the output features of conv1 as input. Then the filters of conv1 must still be divisible into 4 groups after filter pruning, because the input channels of conv2 have to be divided into 4 groups.

Returns:

self.dependency – key: the name of a conv layer; value: the minimum number by which the number of filters of this layer should be divisible.

Return type:

dict

export(filepath)[source]

Export the group dependency to a CSV file. Each line describes a convolution layer: the first part is the PyTorch module name of the conv layer, and the second part is the group count of the filters in this layer. Note that the group count may be larger than the layer's original number of groups.

Output example:

    Conv layer, Groups
    Conv1, 1
    Conv2, 2
    Conv3, 4
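
Below is a minimal usage sketch of GroupDependency (ResNet-18 and the printed dict are only illustrative):

    import torch
    from torchvision.models import resnet18
    from nni.compression.pytorch.utils import GroupDependency

    model = resnet18()
    dummy_input = torch.rand(1, 3, 224, 224)

    group_dependency = GroupDependency(model, dummy_input)
    # Maps each conv layer name to the number its filter count must be divisible by.
    min_groups = group_dependency.build_dependency()
    print(min_groups)  # hypothetical output: {'conv1': 1, 'layer1.0.conv1': 1, ...}

    group_dependency.export('group_dependency.csv')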

ChannelMaskConflict

class nni.compression.pytorch.utils.ChannelMaskConflict(masks, model, dummy_input, traced=None)[source]

ChannelMaskConflict fixes the mask conflicts between layers that have channel dependencies with each other.

Parameters:
  • masks (dict) – a dict object that stores the masks

  • model (torch.nn.Module) – model to fix the mask conflict

  • dummy_input (torch.Tensor) – input example to trace the model

  • traced (torch._C.torch.jit.TopLevelTracedModule) – The traced graph of the target model. If this parameter is not None, we do not use the model and dummy_input to get the traced graph.

fix_mask()[source]

Fix the mask conflicts before mask inference for the layers that have shape dependencies. This function should be called before the mask inference of the ‘speedup’ module. Only structured pruning masks are supported.
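
Below is a minimal usage sketch of ChannelMaskConflict. The masks are produced here with NNI's L1NormPruner purely for illustration; any structured pruning masks in the same dict format would work, and fix_mask() is assumed to update the masks in place:

    import torch
    from torchvision.models import resnet18
    from nni.compression.pytorch.pruning import L1NormPruner
    from nni.compression.pytorch.utils import ChannelMaskConflict

    model = resnet18()
    dummy_input = torch.rand(1, 3, 224, 224)

    # Generate structured masks with a pruner (any pruner producing
    # channel-wise masks would do).
    config_list = [{'sparsity': 0.5, 'op_types': ['Conv2d']}]
    pruner = L1NormPruner(model, config_list)
    _, masks = pruner.compress()
    pruner._unwrap_model()

    # Align the masks of layers that have channel dependencies with each other.
    ChannelMaskConflict(masks, model, dummy_input).fix_mask()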

GroupMaskConflict

class nni.compression.pytorch.utils.GroupMaskConflict(masks, model, dummy_input, traced=None)[source]

GroupMaskConflict fixes the mask conflicts between layers that have group dependencies with each other.

Parameters:
  • masks (dict) – a dict object that stores the masks

  • model (torch.nn.Module) – model to fix the mask conflict

  • dummy_input (torch.Tensor) – input example to trace the model

  • traced (torch._C.torch.jit.TopLevelTracedModule) – The traced model of the target model. If this parameter is not None, we do not use the model and dummy_input to get the traced graph.

fix_mask()[source]

Fix the mask conflicts before mask inference for the layers that have group dependencies. This function should be called before the mask inference of the ‘speedup’ module.
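
In a typical pipeline both conflict fixers are applied to the same mask dict before speedup. The sketch below reuses the masks, model, and dummy_input from the ChannelMaskConflict example above and assumes that fix_mask() updates the masks in place; the ModelSpeedup call is commented out and only indicates where the fixed masks would be consumed:

    from nni.compression.pytorch.utils import GroupMaskConflict, ChannelMaskConflict

    # Fix group conflicts first, then channel conflicts.
    GroupMaskConflict(masks, model, dummy_input).fix_mask()
    ChannelMaskConflict(masks, model, dummy_input).fix_mask()

    # The adjusted masks can then be passed to the speedup module, e.g.:
    # from nni.compression.pytorch import ModelSpeedup
    # ModelSpeedup(model, dummy_input, masks).speedup_model()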

count_flops_params

nni.compression.pytorch.utils.count_flops_params(model, x, custom_ops=None, verbose=True, mode='default')[source]

Count FLOPs and Params of the given model. This function identifies the masks on the modules and takes the pruned shapes into consideration. Note that, for structured pruning, we only identify the remaining filters according to their masks and do not take the pruned input channels into consideration, so the calculated FLOPs will be larger than the real number.

The FLOPs are counted “per sample”, which means that if the input has a batch size larger than 1, the calculated FLOPs should not differ from those of a batch size of 1.

Parameters:
  • model (nn.Module) – Target model.

  • x (tuple or tensor) – The input shape of the data (a tuple), a tensor, or a tuple of tensors as input data.

  • custom_ops (dict) – A mapping from module type (torch.nn.Module) to a custom operation. The custom operation is a callback function that calculates the module's FLOPs and parameters, and it overwrites the default operation. For reference, please see ops in ModelProfiler.

  • verbose (bool) – If False, mute detailed information about modules. Default is True.

  • mode (str) – The mode of collecting information. If set to default, only the information of convolution and linear layers will be collected. If set to full, other operations will also be collected.

Returns:

Representing total FLOPs, total parameters, and a detailed list of results, respectively. The detailed results are a list of dicts, each of which contains (name, module_type, weight_shape, flops, params, input_size, output_size) as its keys.

Return type:

tuple of int, int, and list of dict
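
Below is a minimal usage sketch of count_flops_params (ResNet-18 and the input shape are only illustrative):

    import torch
    from torchvision.models import resnet18
    from nni.compression.pytorch.utils import count_flops_params

    model = resnet18()

    # Pass the input shape as a tuple ...
    flops, params, results = count_flops_params(model, (1, 3, 224, 224), verbose=False)
    print(f'FLOPs: {flops}, Params: {params}')

    # ... or pass a concrete tensor and collect all operation types.
    x = torch.rand(1, 3, 224, 224)
    flops, params, results = count_flops_params(model, x, mode='full')

    for item in results:
        print(item['name'], item['module_type'], item['flops'], item['params'])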

compute_sparsity

nni.compression.pytorch.utils.pruning.compute_sparsity(origin_model, compact_model, compact_model_masks, config_list)[source]

This function computes how much the original model has been compressed in the current state. The current state means compact_model + compact_model_masks (i.e., compact_model_masks applied on compact_model). The compact model is the original model after pruning, and it may have a different structure from origin_model because of speedup.

Parameters:
  • origin_model (torch.nn.Module) – The original un-pruned model.

  • compact_model (torch.nn.Module) – The model after speedup or original model.

  • compact_model_masks (Dict[str, Dict[str, Tensor]]) – The masks applied on the compact model; if the original model has been sped up, this should be {}.

  • config_list (List[Dict]) – The config_list used by pruning the original model.

Returns:

(current2origin_sparsity, compact2origin_sparsity, mask2compact_sparsity). current2origin_sparsity is how much the original model has been compressed in the current state. compact2origin_sparsity is the sparsity obtained by comparing the structures of the original model and the compact model. mask2compact_sparsity is the sparsity computed by counting the zero values in the masks.

Return type:

Tuple[List[Dict], List[Dict], List[Dict]]
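
Below is a minimal usage sketch of compute_sparsity. The compact model is produced here with NNI's L1NormPruner purely for illustration and no speedup is applied, so the masks remain attached to the compact model; the config_list shown is only an example:

    import copy
    import torch
    from torchvision.models import resnet18
    from nni.compression.pytorch.pruning import L1NormPruner
    from nni.compression.pytorch.utils.pruning import compute_sparsity

    origin_model = resnet18()
    config_list = [{'sparsity': 0.5, 'op_types': ['Conv2d']}]

    # Prune a copy of the original model to obtain the compact model and its masks.
    compact_model = copy.deepcopy(origin_model)
    pruner = L1NormPruner(compact_model, config_list)
    _, compact_model_masks = pruner.compress()
    pruner._unwrap_model()

    current2origin, compact2origin, mask2compact = compute_sparsity(
        origin_model, compact_model, compact_model_masks, config_list)
    print(current2origin)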