It involves training the model at high precision and then quantizing the weights and activations to lower precision for the inference stage. This allows for a much smaller model size while retaining most of the performance. Because quantization represents model parameters with lower-bit integers (e.g., int8), the
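As a minimal sketch of the idea (not any specific library's API), the following shows symmetric per-tensor int8 quantization of a float32 weight tensor with NumPy: a scale maps the largest weight magnitude to 127, the rounded integers are stored, and an approximate float tensor is recovered at inference time. The function names are illustrative, not from the source.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization (illustrative sketch)."""
    # Choose a scale so the largest magnitude maps to the int8 limit 127.
    scale = float(np.max(np.abs(w))) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    # Recover an approximation of the original float32 weights.
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)

q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# int8 storage is 4x smaller than float32 for the same shape.
print(w.nbytes, q.nbytes)
```

The rounding step bounds the per-weight reconstruction error by half the scale, which is why accuracy is largely retained despite the 4x reduction in storage.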