Pruning Config Specification¶
The Keys in Config List¶
Each sub-config in the config list is a dict, and the scope of each setting (key) is only internal to each sub-config. If multiple sub-configs are configured for the same layer, the later ones will overwrite the previous ones.
op_types¶
The type of the layers targeted by this sub-config.
If op_names
is not set in this sub-config, all layers in the model that satisfy the type will be selected.
If op_names
is set in this sub-config, the selected layers should satisfy both type and name.
op_names¶
The name of the layers targeted by this sub-config.
If op_types
is set in this sub-config, the selected layer should satisfy both type and name.
op_partial_names¶
This key is for the layers to be pruned with names that have the same sub-string. NNI will find all names in the model,
find names that contain one of op_partial_names
, and append them into the op_names
.
sparsity_per_layer¶
The sparsity ratio of each selected layer.
e.g., the sparsity_per_layer
is 0.8 means each selected layer will mask 80% values on the weight.
If layer_1
(500 parameters) and layer_2
(1000 parameters) are selected in this sub-config,
then layer_1
will be masked 400 parameters and layer_2
will be masked 800 parameters.
total_sparsity¶
The sparsity ratio of all selected layers, means that sparsity ratio may no longer be even between layers.
e.g., the total_sparsity
is 0.8 means 80% of parameters in this sub-config will be masked.
If layer_1
(500 parameters) and layer_2
(1000 parameters) are selected in this sub-config,
then layer_1
and layer_2
will be masked a total of 1200 parameters,
how these total parameters are distributed between the two layers is determined by the pruning algorithm.
sparsity¶
sparsity
is an old config key from the pruning v1, it has the same meaning as sparsity_per_layer
.
You can also use sparsity
right now, but it will be deprecated in the future.
max_sparsity_per_layer¶
This key is usually used with total_sparsity
. It limits the maximum sparsity ratio of each layer.
In total_sparsity
example, there are 1200 parameters that need to be masked and all parameters in layer_1
may be totally masked.
To avoid this situation, max_sparsity_per_layer
can be set as 0.9, this means up to 450 parameters can be masked in layer_1
,
and 900 parameters can be masked in layer_2
.
exclude¶
The exclude
and sparsity
keyword are mutually exclusive and cannot exist in the same sub-config.
If exclude
is set in sub-config, the layers selected by this config will not be pruned.