Module Fusion¶
Module fusion is a new feature in the quantizatizer of NNI 3.0. This feature can fuse the specified sub-models in the simulated quantization process to align with the inference stage of model deployment, reducing the error between the simulated quantization and inference stages.
Users can use this feature by directly defining fuse_names
in each configure of config_list.
fuse_names
is an optional parameter of type List[(str,)]
. Each tuple specifies the name of the module
to be fused in the current configure in the model. Meanwhile, each tuple has 2 or 3 elements, and the first module
in each tuple is the fused module, which contains all the operations of all the modules in the tuple.
The rest of the modules will be replaced by Identity
during the quantization process. Here is an example:
# define the Mnist Model
class Mnist(torch.nn.Module):
def __init__(self):
super().__init__()
self.conv1 = torch.nn.Conv2d(1, 20, 5, 1)
self.conv2 = torch.nn.Conv2d(20, 50, 5, 1)
self.fc1 = torch.nn.Linear(4 * 4 * 50, 500)
self.fc2 = torch.nn.Linear(500, 10)
self.relu1 = torch.nn.ReLU6()
self.relu2 = torch.nn.ReLU6()
self.relu3 = torch.nn.ReLU6()
self.max_pool1 = torch.nn.MaxPool2d(2, 2)
self.max_pool2 = torch.nn.MaxPool2d(2, 2)
self.batchnorm1 = torch.nn.BatchNorm2d(20)
def forward(self, x):
x = self.relu1(self.batchnorm1(self.conv1(x)))
x = self.max_pool1(x)
x = self.relu2(self.conv2(x))
x = self.max_pool2(x)
x = x.view(-1, 4 * 4 * 50)
x = self.relu3(self.fc1(x))
x = self.fc2(x)
return F.log_softmax(x, dim=1)
# define the config list
config_list = [
{
'target_names':['_input_', 'weight', '_output_'],
'op_names': ['conv1'],
'quant_dtype': 'int8',
'quant_scheme': 'affine',
'granularity': 'default',
'fuse_names': [("conv1", "batchnorm1")]
}]