TrainingSettings
CLASS - TrainingSettings(
    num_epochs: int = 1,
    steps_per_epoch: Optional[int] = None,
    evaluation_steps: Optional[int] = None,
    stats_steps: int = 50,
    print_interval: int = 100,
    print_num_steps_averaging: int = 20,
    save_interval: Optional[int] = None,
    reset_training_dataset_between_epochs: bool = False,
    reset_evaluation_dataset_between_epochs: bool = True,
    grads_accumulation_steps: Optional[int] = None,
    mixed_precision: bool = False,
    use_fp16_weights: bool = False,
    use_gradients_checkpoint: bool = False,
    lr_warmup_epochs: Optional[int] = 1,
    lr_warmup_steps_per_epoch: int = 500,
    random_seed: Optional[int] = int(time.time()),
    skip_initial_evaluation: bool = False,
    clip_grad_norm: Union[int, float, None] = None,
    clip_grad_norm_type: Union[int, float, None] = 2.0,
)
This dataclass holds the training settings for the compression process.
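A configuration is typically built by overriding only the fields whose defaults need to change. The snippet below is a self-contained sketch: since the real import path is not shown here, it defines a hypothetical minimal stand-in mirroring a few of the documented fields and defaults, not the library's actual class.

```python
import time
from dataclasses import dataclass
from typing import Optional, Union

# Hypothetical stand-in mirroring a few documented fields and defaults;
# the real TrainingSettings ships with the library.
@dataclass
class TrainingSettings:
    num_epochs: int = 1
    steps_per_epoch: Optional[int] = None
    stats_steps: int = 50
    mixed_precision: bool = False
    clip_grad_norm: Union[int, float, None] = None
    clip_grad_norm_type: Union[int, float, None] = 2.0
    random_seed: Optional[int] = None  # documented default: int(time.time())

    def __post_init__(self):
        # Mirror the documented default: seed from the current Unix time.
        if self.random_seed is None:
            self.random_seed = int(time.time())

# Override only the fields that differ from the defaults.
settings = TrainingSettings(num_epochs=10, mixed_precision=True, clip_grad_norm=1.0)
```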
Class Variables
- num_epochs (int) - Number of epochs to run.
- steps_per_epoch (Optional[int]) - Number of steps per epoch. If None, each epoch runs until the training dataloader is exhausted.
- evaluation_steps (Optional[int]) - Number of evaluation steps after each epoch. If None, evaluation runs until the evaluation dataloader is exhausted.
- stats_steps (int) - Number of steps to run while collecting statistics of the model.
- print_interval (int) - Interval, in steps, at which results are printed to the console.
- print_num_steps_averaging (int) - Number of steps over which the printed loss/metrics are averaged.
- save_interval (Optional[int]) - Interval at which the model is saved in the CLIKA model format.
- reset_training_dataset_between_epochs (bool) - Whether to reset the training dataloader after every epoch.
- reset_evaluation_dataset_between_epochs (bool) - Whether to reset the evaluation dataloader after every epoch.
- grads_accumulation_steps (Optional[int]) - Number of steps over which to accumulate gradients before applying an update. This helps simulate a larger batch size when the model is too big to fit on a GPU.
- mixed_precision (bool) - Whether to use mixed-precision training.
- use_fp16_weights (bool) - Whether to train with FP16 weights. This reduces the memory requirement of the model but may cause NaN loss for some models.
- use_gradients_checkpoint (bool) - Whether to offload activations to the CPU. This reduces the memory requirement of the model but may increase the iteration time of the compression.
- lr_warmup_epochs (Optional[int]) - Number of epochs over which to run learning-rate warmup. Can be None or 0 to disable warmup.
- lr_warmup_steps_per_epoch (int) - Number of iterations per epoch to run warmup. Ignored if 'lr_warmup_epochs' is None or 0.
- random_seed (Optional[int]) - Random seed. Defaults to the current Unix time.
- skip_initial_evaluation (bool) - Whether to skip evaluating the original model, which is otherwise used for comparison later in the compression.
- clip_grad_norm (Union[int, float, None]) - Maximum gradient norm. Gradient-norm clipping is applied only if 'clip_grad_norm' is not None.
- clip_grad_norm_type (Union[int, float, None]) - Type of norm used for gradient clipping. Can be float('inf') for the infinity norm, 2 for the L2 norm, or 1 for the L1 norm.