
Output log breakdown

An example of the log produced when running ACE on the vision tower (CLIPVisionModel) of OpenAI's CLIP model.
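The run below compiles the CLIP ViT-B/16 vision tower; the checkpoint name is inferred from the exported file names in the log ("openai_clip_vit_base_patch16_..."). Loading the model is not shown in the log itself, so the following Hugging Face transformers snippet is a minimal sketch for orientation only:

```python
# Illustrative only: load the model that the log below refers to.
# The checkpoint name is inferred from the exported file names
# ("openai_clip_vit_base_patch16_vision_model_trt...").
import torch
from transformers import CLIPVisionModel

model = CLIPVisionModel.from_pretrained("openai/clip-vit-base-patch16")
model.eval()

# CLIP ViT-B/16 expects 224x224 RGB inputs.
dummy = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    out = model(pixel_values=dummy)
print(out.last_hidden_state.shape)  # torch.Size([1, 197, 768])
```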

Basic flow

The CLIKA ACE engine:

  1. Parses the input model.
  2. Applies a series of transformations that remove or merge nodes until the graph converges.
  3. Applies quantization and pruning, if relevant.
    • In the case of quantization, a table is printed showing how many layers are quantized under different quantization sensitivity thresholds (the relevant settings are mirrored in the sketch after this list).
  4. Exports the model to the selected framework.
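The "Deployment Settings" and "Global Quantization Settings" blocks printed in the log reflect the options this run was compiled with. The snippet below only mirrors those printed values as plain Python data so they are easy to compare against the log; it is not the ACE API itself.

```python
# Plain-Python restatement of the settings printed in the log below.
# This is NOT the ACE API; it only mirrors the reported values.
deployment_settings = {
    "target_framework": "TensorRT (NVIDIA)",
}

global_quantization_settings = {
    "weights_num_bits": [8, 4],                  # weights may be quantized to 8 or 4 bits
    "activations_num_bits": [8],                 # activations quantized to 8 bits
    "prefer_weights_only_quantization": False,
    "weights_only_quantization_block_sizes": [0, 32, 64, 128, 256, 512],
    "quantization_sensitivity_threshold": 0.03,  # also referenced by the summary table
    "weights_utilize_full_int_range": True,
    "one_extra_bit_for_symmetric_weights": None,
}
```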

Output log

CLIKA: Pre-Compiling 'CLIPVisionModel'
Precompiling: [####################] 100% [34/34][0:00:03<0:00:00, it/s: 2.505]
CLIKA: Done Pre-Compiling 'CLIPVisionModel'
CLIKA: Compiling 'CLIPVisionModel'
===============================================================
== License is Valid. Time Left: N days X hours Y minutes ==
===============================================================
Created log at: /home/user/.clika/logs/clika_<TIME_STAMP>.log
[2025-04-20 17:52:50] CLIKA ACE Version: 25.4.0
[2025-04-20 17:52:50] 'torch' Version: 2.6.0+cu124
[2025-04-20 17:52:50] Python Version: 3.11.11 (main, Dec 11 2024, 16:28:39) [GCC 11.2.0]
[2025-04-20 17:52:50]

Deployment Settings:
+ target_framework = TensorRT (NVIDIA)

[2025-04-20 17:52:51] Starting to parse the model: 'CLIPVisionModel'
Parsing Model: [####################] 100% [485/485][0:00:00<0:00:00, it/s: 337.262] - Done
[2025-04-20 17:52:52] Discarding given model
[2025-04-20 17:52:53] Removed 201 unnecessary nodes
[2025-04-20 17:53:02] Merged 37 similar nodes
[2025-04-20 17:53:05] Removed 2 nodes
[2025-04-20 17:53:10]

Global Quantization Settings:
+ weights_num_bits = [8, 4]
+ activations_num_bits = [8]
+ prefer_weights_only_quantization = False
+ weights_only_quantization_block_sizes = [0, 32, 64, 128, 256, 512]
+ quantization_sensitivity_threshold = 0.03
+ weights_utilize_full_int_range = True
+ one_extra_bit_for_symmetric_weights = None

Deployment Settings:
+ target_framework = TensorRT (NVIDIA)

Equalization Preparation: [####################] 100% [0:00:01<0:00:00, it/s: 22.240]
Equalizing Model: [####################] 100% [0:00:27<0:00:00, it/s: 26.143]
[2025-04-20 17:53:40] Modified 50 nodes.
Calibrating: [####################] 100% [0:00:07<0:00:00, it/s: 4.482]
Processing Quantization statistics: [####################] 100% [0:00:05<0:00:00, it/s: 51.551]
Measuring Quantization Sensitivity: [####################] 100% [0:01:58<0:00:00, it/s: 0.844]
Applying Quantization: [####################] 100% [0:00:18<0:00:00, it/s: 12.056]
[2025-04-20 17:56:16]

Quantization Summary:
# of Quantizable layers: 99
# of Layers to be Quantized: 59
* Activations 8 bits: 24
* Weights 4 bits | Activations 8 bits: 35

Threshold | # Q-Layers | # Q-Confs     Threshold | # Q-Layers | # Q-Confs
----------|------------|----------     ----------|------------|----------
  .....   |     99     |    149        .......   |     27     |    42
  .....   |     90     |    140        .......   |     14     |    20
  .....   |     84     |    134        .......   |      2     |     2

The table shows how many Layers will be Quantized and
how many Quantization configurations exist for a given Quantization Sensitivity Threshold.
Reminder, the sensitivity given in the Quantization Settings was: 0.03

CLIKA: Done Compiling 'CLIPVisionModel'
[2025-04-20 17:56:20] Serializing Chunk 1/1 345MB to: clip/files/openai_clip_vit_base_patch16_vision_model_trt-clika-ckpt-00001-00001.pompom
[2025-04-20 17:56:20] WARNING Inference may not work as intended.
While trying to deploy dynamic shape model, some warning messages have been generated:
1. flatten: Flatten: in dynamic shape inference shape may not always be the same after Flatten operation.
[2025-04-20 17:56:20] WARNING

===========
Note, you have just used Dynamic-Shape Deployment but the Model is not necessarily Dynamic-Shape Deployment Friendly.
This can happen since you may have a 'Flatten' layer followed by a 'Linear' layer that expects fixed Input Features.
Or perhaps you may have an AdaptivePooling with output_size != (1, 1, ...)
In case the Deployment/Inference fails, please try Fixed Shape.
===========

Exporting Model: [####################] 100% [0:00:01<0:00:00, it/s: 389.359]
[2025-04-20 17:56:24] Overwriting existing model file clip/files/openai_clip_vit_base_patch16_vision_model_trt.onnx
[2025-04-20 17:56:26] Saved model at: clip/files/openai_clip_vit_base_patch16_vision_model_trt.onnx
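Once the ONNX file is saved, a quick way to sanity-check it is to open it with onnxruntime. The sketch below makes a few assumptions: it uses the output path from the last log line, reads the input name and shape from the session rather than hard-coding them, and assumes a 224x224 input based on CLIP ViT-B/16's usual resolution; whether the exported graph runs on a given execution provider depends on the operators it contains.

```python
# Sanity-check the exported file from the last log line above.
# Input names/shapes are read from the session; the 224x224 shape is
# an assumption based on CLIP ViT-B/16's usual input resolution.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession(
    "clip/files/openai_clip_vit_base_patch16_vision_model_trt.onnx",
    providers=["CPUExecutionProvider"],  # or "TensorrtExecutionProvider" on NVIDIA GPUs
)

inp = session.get_inputs()[0]
print(inp.name, inp.shape)

dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)
outputs = session.run(None, {inp.name: dummy})
print([o.shape for o in outputs])
```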