TrainingSettings

Version: Latest

CLASS - TrainingSettings(
    num_epochs: int = 1,
    steps_per_epoch: Optional[int] = None,
    evaluation_steps: Optional[int] = None,
    stats_steps: int = 50,
    print_interval: int = 100,
    print_num_steps_averaging: int = 20,
    save_interval: Optional[int] = None,
    reset_training_dataset_between_epochs: bool = False,
    reset_evaluation_dataset_between_epochs: bool = True,
    grads_accumulation_steps: Optional[int] = None,
    mixed_precision: bool = False,
    use_fp16_weights: bool = False,
    use_gradients_checkpoint: bool = False,
    lr_warmup_epochs: Optional[int] = 1,
    lr_warmup_steps_per_epoch: int = 500,
    random_seed: Optional[int] = int(time.time()),
    skip_initial_evaluation: bool = False,
    clip_grad_norm: Union[int, float, None] = None,
    clip_grad_norm_type: Union[int, float, None] = 2.0,
)

This dataclass holds the training settings used for compression.
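As a rough illustration of how the settings might be filled in, the sketch below overrides a few fields and leaves the rest at their defaults. The import path of the real class is not shown on this page, so the block declares a local stand-in dataclass that simply copies the documented fields and defaults; in practice the class would come from the CLIKA SDK. The chosen values (10 epochs, seed 42, and so on) are arbitrary examples, not recommendations.

```python
import time
from dataclasses import dataclass, replace
from typing import Optional, Union


# Local stand-in mirroring the documented signature (an assumption made so the
# example is self-contained). In real use, import TrainingSettings from the SDK.
@dataclass
class TrainingSettings:
    num_epochs: int = 1
    steps_per_epoch: Optional[int] = None
    evaluation_steps: Optional[int] = None
    stats_steps: int = 50
    print_interval: int = 100
    print_num_steps_averaging: int = 20
    save_interval: Optional[int] = None
    reset_training_dataset_between_epochs: bool = False
    reset_evaluation_dataset_between_epochs: bool = True
    grads_accumulation_steps: Optional[int] = None
    mixed_precision: bool = False
    use_fp16_weights: bool = False
    use_gradients_checkpoint: bool = False
    lr_warmup_epochs: Optional[int] = 1
    lr_warmup_steps_per_epoch: int = 500
    # In this stand-in the default is evaluated once, when the class is defined.
    random_seed: Optional[int] = int(time.time())
    skip_initial_evaluation: bool = False
    clip_grad_norm: Union[int, float, None] = None
    clip_grad_norm_type: Union[int, float, None] = 2.0


# Override only the fields that differ from the defaults.
settings = TrainingSettings(
    num_epochs=10,
    steps_per_epoch=1000,
    grads_accumulation_steps=4,
    mixed_precision=True,
    random_seed=42,      # fixed seed instead of the time-based default
    clip_grad_norm=1.0,  # clipping is applied only when this is not None
)

# Derive a shorter variant without mutating the original, e.g. for a smoke test.
smoke_test = replace(
    settings, num_epochs=1, steps_per_epoch=10, skip_initial_evaluation=True
)
```

Passing an explicit random_seed, as above, is one way to make runs repeatable, since the documented default is derived from the current time.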

Class Variables

num_epochs (int) - Number of epochs to run.
steps_per_epoch (Optional[int]) - Number of steps per epoch. If None, each epoch runs until the training dataloader is exhausted.
evaluation_steps (Optional[int]) - Number of evaluation steps after each epoch. If None, evaluation runs until the evaluation dataloader is exhausted.
stats_steps (int) - Number of steps to run while collecting statistics of the model.
print_interval (int) - Interval at which results are printed to the console.
print_num_steps_averaging (int) - Number of steps over which the loss and metrics printed to the console are averaged.
save_interval (Optional[int]) - Interval at which the model is saved in the CLIKA Model Format. Stated default is 1, although the signature above lists None.
reset_training_dataset_between_epochs (bool) - Whether to reset the training dataloader after every epoch.
reset_evaluation_dataset_between_epochs (bool) - Whether to reset the evaluation dataloader after every epoch.
grads_accumulation_steps (Optional[int]) - Number of steps over which to accumulate gradients. This simulates a larger batch when the model is too large for that batch to fit in GPU memory. Default is None.
mixed_precision (bool) - Whether to use mixed precision training.
use_fp16_weights (bool) - Whether to train with FP16 weights. This reduces the memory requirement of the model but may cause NaN loss for some models.
use_gradients_checkpoint (bool) - Whether to use CPU offloading of activations. This reduces the memory requirement of the model but may increase the iteration time of compression.
lr_warmup_epochs (Optional[int]) - Number of epochs over which to run learning rate warmup. Can be None or 0.
lr_warmup_steps_per_epoch (int) - Number of iterations per epoch for which to run warmup. Ignored if 'lr_warmup_epochs' is None or 0.
random_seed (Optional[int]) - Random seed to use.
skip_initial_evaluation (bool) - Whether to skip the evaluation of the original model, which is used for comparison later in the compression.
clip_grad_norm (Union[int, float, None]) - Value used for gradient norm clipping. Clipping is applied only if 'clip_grad_norm' is not None.
clip_grad_norm_type (Union[int, float, None]) - Type of norm used for gradient clipping: 2 for the L2 norm, 1 for the L1 norm, or float('inf') for the infinity norm.
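
To make the interplay of 'clip_grad_norm' and 'clip_grad_norm_type' more concrete, here is a minimal pure-Python sketch of norm-based gradient clipping. It assumes that 'clip_grad_norm' acts as the maximum allowed norm, by analogy with the max_norm argument of PyTorch's torch.nn.utils.clip_grad_norm_; this page does not confirm that. The helper clip_grads is hypothetical and works on a flat list of floats rather than on tensors.

```python
import math
from typing import List, Optional, Union

Number = Union[int, float]


def clip_grads(
    grads: List[float],
    clip_grad_norm: Optional[Number],
    clip_grad_norm_type: Number = 2.0,
) -> List[float]:
    """Hypothetical helper: rescale grads so their norm is at most clip_grad_norm."""
    # Mirrors the documented rule: no clipping when clip_grad_norm is None.
    if clip_grad_norm is None:
        return list(grads)

    if math.isinf(clip_grad_norm_type):
        # Infinity norm: largest absolute gradient value.
        total_norm = max((abs(g) for g in grads), default=0.0)
    else:
        # p-norm: p=1 gives the L1 norm, p=2 gives the L2 norm.
        p = float(clip_grad_norm_type)
        total_norm = sum(abs(g) ** p for g in grads) ** (1.0 / p)

    # Gradients already within the limit are left unchanged.
    if total_norm <= clip_grad_norm:
        return list(grads)

    scale = clip_grad_norm / total_norm
    return [g * scale for g in grads]


grads = [3.0, -4.0]
l2 = clip_grads(grads, clip_grad_norm=1.0, clip_grad_norm_type=2)    # L2 norm is 5.0
l1 = clip_grads(grads, clip_grad_norm=1.0, clip_grad_norm_type=1)    # L1 norm is 7.0
linf = clip_grads(grads, clip_grad_norm=1.0, clip_grad_norm_type=float("inf"))  # max abs is 4.0
unclipped = clip_grads(grads, clip_grad_norm=None)                   # returned as-is
```

The same gradients are scaled by different factors depending on the norm type, because each norm measures their size differently.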
