Version: 24.8.0

DistributedTrainingSettings

CLASS - DistributedTrainingSettings(

  • multi_gpu: bool = False
  • use_sharding: bool = False

)

This dataclass holds all the information needed to configure distributed training.
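As a sketch of how the dataclass might be constructed, the snippet below mirrors the definition above as a standalone dataclass; the actual import path depends on your installation and is not shown here.

```python
from dataclasses import dataclass


# Standalone mirror of the documented dataclass (hypothetical; import
# the real class from the library in practice).
@dataclass
class DistributedTrainingSettings:
    multi_gpu: bool = False
    use_sharding: bool = False


# Both features are disabled by default; enable them explicitly.
settings = DistributedTrainingSettings(multi_gpu=True, use_sharding=True)
```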

Class Variables

  • multi_gpu (bool) - Whether to use multi-GPU training for the compression.
  • use_sharding (bool) - Whether to use sharding to split the model and optimizer states across the GPUs. Sharding can help fit larger models but may come at the cost of increased latency. This is similar to PyTorch FSDP / DeepSpeed.
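To make the memory tradeoff behind use_sharding concrete, here is an illustrative sketch (hypothetical helper and numbers, not part of the library): without sharding, every GPU holds a full replica of the model and optimizer states, whereas FSDP-style sharding splits that state evenly across GPUs.

```python
def per_gpu_state_bytes(param_bytes: int, optimizer_bytes: int,
                        n_gpus: int, use_sharding: bool) -> int:
    """Rough per-GPU memory estimate for model + optimizer state (sketch)."""
    total = param_bytes + optimizer_bytes
    if use_sharding:
        # FSDP-style: each GPU holds only its shard of the state.
        return total // n_gpus
    # DDP-style: each GPU holds a full replica.
    return total


# Illustrative numbers only: 12 GB of parameters + 24 GB of optimizer
# state on 4 GPUs fits per-GPU with sharding, but not replicated.
replicated = per_gpu_state_bytes(12, 24, n_gpus=4, use_sharding=False)
sharded = per_gpu_state_bytes(12, 24, n_gpus=4, use_sharding=True)
```

The extra latency mentioned above comes from the communication needed to gather sharded parameters before each layer's computation.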