Configuration file
A configuration file allows the user to save and load a Settings object for the CCO.
Example YAML file
# ================= CLIKA ACE hyperparameter configuration file ================= #
deployment_settings:
  # Choose the target framework ["tflite", "ov", "ort", "trt", "qnn"]
  target_framework: trt
  # (OPTIONAL) Set true if you're planning to run the model on a CPU that supports AVX512-VNNI or on an ARM device
  # Only applicable for "ov", "ort", "qnn"
  # weights_utilize_full_int_range: false

training_settings:
  # Number of epochs to run the CCO
  num_epochs: 100
  # Gradient accumulation steps (useful for larger models that must run with a smaller batch size)
  grads_acc_steps: 1
  # Number of steps to take per epoch
  steps_per_epoch: 10000
  evaluation_steps: null
  # Number of warm-up epochs to take
  lr_warmup_epochs: 1
  # Number of steps in each epoch of the learning-rate warm-up stage
  lr_warmup_steps_per_epoch: 500
  # Use Automatic Mixed Precision (reduces VRAM usage): half precision is used automatically for the weights, FP32 for the gradients.
  # AMP dtype: [float16, bfloat16, null]
  amp_dtype: null
  # Specify the weight dtype of the model: [float16, bfloat16, null]
  # If null, the default (float32) is used
  weights_dtype: null
  # Number of steps for initial quantization calibration
  stats_steps: 20
  # Use activation checkpointing, which offloads the activations to the CPU.
  # This helps reduce the memory requirement of the model but may increase iteration time during compression
  activations_offloading: false
  # Offload the model parameters to the CPU.
  # This helps reduce the memory requirement of the model but may increase iteration time during compression
  params_offloading: false
  # Enable gradient clipping; use null or comment out to disable
  clip_grad_norm_val: null
  clip_grad_norm_type: 2.0
  # .pompom files save interval in epochs
  save_interval: null
  # Print a log line every x steps
  print_interval: 100
  # Moving-average window size used for printed metrics
  print_ma_window_size: 50
  # Reset the train-loader/eval-loader between epochs
  reset_train_data: false
  reset_eval_data: true
  # Skip the initial evaluation before compression
  skip_initial_eval: false
  # Random seed applied by the CLIKA SDK
  random_seed: null
  # Ignore `--ckpt` (if given) and indicate to the CLIKA SDK that the model has untrained weights
  is_training_from_scratch: false

global_quantization_settings:
  method: qat
  # How many bits to use for quantizing the weights
  weights_num_bits: 8
  # How many bits to use for quantizing the activations
  activations_num_bits: 8
  # Whether to skip quantization for the tail of the model (keep it null if unsure)
  skip_tail_quantization: null
  # Whether to automatically skip quantization for sensitive layers (keep it true if unsure)
  automatic_skip_quantization: true
  # The threshold used to decide automatically whether to skip quantization for layers that are too sensitive.
  # Only applied if 'automatic_skip_quantization' is true.
  # Some tips:
  #   * For small models like MobileNet, 5000 is a good value
  #   * For big models, 10000 is a good value
  # Quantization sensitivity is measured as L2(QuantizedTensor - FloatTensor); the higher it is, the more "destructive" the quantization is.
  # This also implies that it can take longer for a model to recover its performance if it is overly sensitive.
  quantization_sensitivity_threshold: null

# (OPTIONAL) Uncomment if you would like to enable LoRA
# global_lora_settings:
#   rank: 2
#   alpha: 1
#   dropout_rate: 0.05

distributed_training_settings:
  # Enable multi-GPU training
  multi_gpu: false
  # Enable FSDP if true (use_sharding: true), otherwise use DDP (use_sharding: false)
  use_sharding: false

# (OPTIONAL) Layer compression settings
# See https://docs.clika.io/docs/quantization_guide
# layer_settings:
#   conv:
#     quantization_settings:
#       weights_num_bits: 8
#       activations_num_bits: 8
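The same values can also be set programmatically on a Settings object before saving it. The following is a minimal sketch: training_settings.num_epochs is used later on this page, while the other attribute paths are an assumption that the object mirrors the YAML keys above.
from clika_compression import Settings

settings = Settings()  # Default settings
settings.training_settings.num_epochs = 100
# Assumed attribute paths, mirroring the YAML keys above
settings.training_settings.grads_acc_steps = 1
settings.deployment_settings.target_framework = 'trt'
settings.save('config.yml')  # Writes a YAML file equivalent to the example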
Configuration schema
You can set the following parameters in the configuration file:
training_settings:
    num_epochs: int
    stats_steps: int
    steps_per_epoch: OneOf[int, "null"]
    evaluation_steps: OneOf[int, "null"]
    print_interval: int
    print_ma_window_size: int
    save_interval: OneOf[int, "null"]
    reset_train_data: bool
    reset_eval_data: bool
    grads_acc_steps: OneOf[int, "null"]
    amp_dtype: OneOf["bfloat16", "float16", "null"]
    weights_dtype: OneOf["bfloat16", "float16", "null"]
    activations_offloading: bool
    params_offloading: bool
    lr_warmup_epochs: OneOf[int, "null"]
    lr_warmup_steps_per_epoch: int
    random_seed: OneOf[int, "null"]
    skip_initial_eval: bool
    clip_grad_norm_val: OneOf[int, float, "null"]
    clip_grad_norm_type: OneOf[int, float, "null"]
    is_training_from_scratch: bool
global_quantization_settings
global_lora_settings
deployment_settings:
    target_framework: OneOf["tflite", "ov", "ort", "trt", "qnn"]
    weights_utilize_full_int_range: OneOf[bool, "null"]
distributed_training_settings
layer_settings
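Fields typed OneOf[..., "null"] accept YAML null, which corresponds to None on the Settings object. A minimal sketch, assuming the attribute paths mirror the schema keys above:
from clika_compression import Settings

settings = Settings()
settings.training_settings.evaluation_steps = None   # saved as `null` in the YAML
settings.training_settings.clip_grad_norm_val = 1.0  # int or float both satisfy OneOf[int, float, "null"]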
Saving and loading configuration files
To save a YAML file from an existing Settings object, use the Settings.save
method; to load one, use the Settings.load_from_path
method. For example, to save:
from clika_compression import Settings
path = 'config.yml'
settings = Settings() # Default settings
# Do some modification to the settings object
settings.training_settings.num_epochs = 10
...
settings.save(path) # Save as a yaml file
To load a YAML file, use:
from clika_compression import Settings
path = '/path/to/config.yml'
settings = Settings.load_from_path(path)
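Loading and saving can be combined so that an existing configuration file serves as a template. A minimal round-trip sketch using only the methods shown above; the output filename is just an example:
from clika_compression import Settings

settings = Settings.load_from_path('config.yml')
settings.training_settings.num_epochs = 20  # Override a single field
settings.save('config_modified.yml')  # Write the modified settings to a new file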