Release Notes
24.9.0 - September 10th 2024
note
Please note that we generally add support for both `torch.function_name` and `Tensor.function_name`, but if you encounter a case where only one or the other is supported, please let us know at support@clika.io.
Features
- BFloat16 training is now possible - this will help reduce memory requirements (see the sketch below).
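The notes don't show how BFloat16 training is enabled through the SDK itself, so the following is only a minimal sketch using standard PyTorch autocast; the model, optimizer, and data are placeholders.

```python
import torch

# Placeholder model and optimizer; in practice this would be the module
# you train through the SDK.
model = torch.nn.Linear(128, 10).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = torch.nn.CrossEntropyLoss()

x = torch.randn(32, 128, device="cuda")
y = torch.randint(0, 10, (32,), device="cuda")

# Standard PyTorch BFloat16 autocast. BFloat16 keeps Float32's exponent
# range, so unlike Float16 it usually needs no gradient scaling, while
# activations take half the memory of Float32.
with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    loss = loss_fn(model(x), y)

loss.backward()
optimizer.step()
optimizer.zero_grad()
```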
Improvements
- Improved the `torch.compile` functionality: the SDK now tries additional compilation strategies before failing (see the example below).
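For context, this is the usage path the improvement applies to. The 'clika' backend string is taken from the bug-fix note below; the model is illustrative, and the call assumes the CLIKA SDK is installed so that the backend is registered with `torch.compile`.

```python
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(64, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 10),
)

# Compile through the CLIKA backend. Per this release, the SDK now attempts
# several compilation strategies internally before giving up with an error.
compiled = torch.compile(model, backend="clika")

out = compiled(torch.randn(8, 64))
```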
Bug fixes
- Fixed an issue with the Detach layer. The SDK was previously unable to serialize it properly.
- Fixed an issue with `Tensor.size()` not being correctly connected to successor nodes.
- Fixed an issue with multiple ClikaModules living concurrently. After `torch.compile(..., backend='clika')`, loading another `ClikaModule` from a serialized file could fail due to a global state collision.
- Significantly improved the 'svg' file generation runtime of the `ClikaModule.clika_visualize` method. On gigantic models it used to hang, seemingly forever. Let us know if there are any further issues.
- Fixed an issue when deploying the Clamp layer to ONNX.
- Fixed an input ordering issue when `input_names` is provided to `torch.onnx.export` (see the example below).
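A minimal example of the export call the last fix applies to; the model and the name strings are illustrative, and in practice the exported module would be the one produced by the SDK.

```python
import torch

model = torch.nn.Linear(16, 4)
example_input = torch.randn(1, 16)

# With input_names given, each name is bound to the model's positional
# inputs in order; the fix ensures that ordering is respected.
torch.onnx.export(
    model,
    (example_input,),
    "model.onnx",
    input_names=["input"],
    output_names=["output"],
)
```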
Future features
- Full Float16 training - the parameters will be in Float16
- More work on Multi-GPU capability to make it more efficient
- Optimization of VRAM/RAM consumption
- Adding a method on the `ClikaModule` object to get the Quantization Sensitivities sorted highest first. This will make understanding your model easier, in addition to the 'svg' files.
- An additional Quantization Algorithm that will reduce quantization time even further.