
TrainingSettings

CLASS - TrainingSettings(

  • num_epochs: int = 1
  • steps_per_epoch: Optional[int] = None
  • evaluation_steps: Optional[int] = None
  • stats_steps: int = 50
  • print_interval: int = 100
  • print_num_steps_averaging: int = 20
  • save_interval: Optional[int] = None
  • reset_training_dataset_between_epochs: bool = False
  • reset_evaluation_dataset_between_epochs: bool = True
  • grads_accumulation_steps: Optional[int] = None
  • mixed_precision: bool = False
  • use_fp16_weights: bool = False
  • use_gradients_checkpoint: bool = False
  • lr_warmup_epochs: Optional[int] = 1
  • lr_warmup_steps_per_epoch: int = 500
  • random_seed: Optional[int] = int(time.time())
  • skip_initial_evaluation: bool = False
  • clip_grad_norm: Union[int, float, None] = None
  • clip_grad_norm_type: Union[int, float, None] = 2.0

)

This dataclass holds the training settings used during compression.
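
As a rough illustration of how the fields and defaults fit together, the sketch below defines a local stand-in dataclass that mirrors the signature above. It is not the library's class: the import path for the real TrainingSettings is not shown on this page, and using `default_factory` for `random_seed` (so the seed is drawn per instance) is an assumption.

```python
import time
from dataclasses import dataclass, field
from typing import Optional, Union


@dataclass
class TrainingSettings:
    """Local stand-in mirroring the documented fields; NOT the library class."""

    num_epochs: int = 1
    steps_per_epoch: Optional[int] = None
    evaluation_steps: Optional[int] = None
    stats_steps: int = 50
    print_interval: int = 100
    print_num_steps_averaging: int = 20
    save_interval: Optional[int] = None
    reset_training_dataset_between_epochs: bool = False
    reset_evaluation_dataset_between_epochs: bool = True
    grads_accumulation_steps: Optional[int] = None
    mixed_precision: bool = False
    use_fp16_weights: bool = False
    use_gradients_checkpoint: bool = False
    lr_warmup_epochs: Optional[int] = 1
    lr_warmup_steps_per_epoch: int = 500
    # Assumption: the seed is evaluated per instance rather than once at import.
    random_seed: Optional[int] = field(default_factory=lambda: int(time.time()))
    skip_initial_evaluation: bool = False
    clip_grad_norm: Union[int, float, None] = None
    clip_grad_norm_type: Union[int, float, None] = 2.0


# Override only what differs from the defaults; everything else keeps
# the value listed in the signature.
settings = TrainingSettings(
    num_epochs=10,
    steps_per_epoch=1000,
    mixed_precision=True,
    clip_grad_norm=1.0,
    random_seed=42,  # fix the seed for reproducible runs
)
print(settings)
```

Passing a fixed `random_seed` avoids the time-based default, which otherwise changes from run to run.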

Class Variables

  • num_epochs (int) - Number of epochs to run.
  • steps_per_epoch (Optional[int]) - Number of steps per epoch. If None, each epoch runs until the training dataloader is exhausted.
  • evaluation_steps (Optional[int]) - Number of evaluation steps to run after each epoch. If None, evaluation runs until the evaluation dataloader is exhausted.
  • stats_steps (int) - Number of steps to run while collecting statistics of the model.
  • print_interval (int) - Interval at which results are printed to the console.
  • print_num_steps_averaging (int) - Number of steps over which the loss and metrics printed to the console are averaged.
  • save_interval (Optional[int]) - Interval at which to save the model in the CLIKA Model Format. Default is None.
  • reset_training_dataset_between_epochs (bool) - Whether to reset the training dataloader after every epoch.
  • reset_evaluation_dataset_between_epochs (bool) - Whether to reset the evaluation dataloader after every epoch.
  • grads_accumulation_steps (Optional[int]) - Number of steps over which to accumulate gradients. This simulates a larger batch when the model is too big for that batch to fit on a GPU. Default is None.
  • mixed_precision (bool) - Whether to use mixed precision training.
  • use_fp16_weights (bool) - Whether to train with FP16 weights. This reduces the memory requirement of the model but may cause NaN loss for some models.
  • use_gradients_checkpoint (bool) - Whether to use CPU offloading of activations. This reduces the memory requirement of the model but may increase the iteration time of compression.
  • lr_warmup_epochs (Optional[int]) - Number of epochs over which to run learning rate warmup. Can be None or 0.
  • lr_warmup_steps_per_epoch (int) - Number of iterations per epoch for which to run warmup. This argument is ignored if 'lr_warmup_epochs' is None or 0.
  • random_seed (Optional[int]) - Random seed to use.
  • skip_initial_evaluation (bool) - Whether to skip the evaluation of the original model, which is used for comparison later in the compression.
  • clip_grad_norm (Union[int, float, None]) - Maximum gradient norm. Gradient norm clipping is applied only if 'clip_grad_norm' is not None.
  • clip_grad_norm_type (Union[int, float, None]) - Type of norm used for gradient clipping. Can be float('inf') for the infinity norm, 2 for the L2 norm, or 1 for the L1 norm.
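
To make the interaction between 'clip_grad_norm' and 'clip_grad_norm_type' more concrete, here is a minimal pure-Python sketch of norm-based gradient clipping over a flat list of gradient values. The helper names `total_norm` and `clip_by_norm` are hypothetical and exist only for this illustration; how the library performs clipping internally is not described on this page.

```python
import math
from typing import List, Union

Number = Union[int, float]


def total_norm(grads: List[float], norm_type: Number) -> float:
    """Return the p-norm of a flat list of gradient values."""
    if math.isinf(norm_type):
        # Infinity norm: the largest absolute value.
        return max(abs(g) for g in grads)
    return sum(abs(g) ** norm_type for g in grads) ** (1.0 / norm_type)


def clip_by_norm(
    grads: List[float],
    max_norm: Union[Number, None],
    norm_type: Number = 2.0,
) -> List[float]:
    """Rescale grads so their norm is at most max_norm; None disables clipping."""
    if max_norm is None:
        return list(grads)
    norm = total_norm(grads, norm_type)
    if norm <= max_norm:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]


grads = [3.0, -4.0]
l1 = total_norm(grads, 1)                # sum of absolute values
l2 = total_norm(grads, 2.0)              # Euclidean length
linf = total_norm(grads, float("inf"))   # largest absolute value
clipped = clip_by_norm(grads, max_norm=1.0, norm_type=2.0)
print(l1, l2, linf, clipped)
```

Clipping rescales every gradient by the same factor, so the direction of the update is preserved and only its magnitude is capped.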