BitsandbytesQuantization8bits Configuration
Parameters for 8-bit bitsandbytes quantization.
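As a rough illustration of how the options in the parameter table fit together, the sketch below mirrors them in a small Python dataclass. The class name `Quantization8bitsSketch` and the `to_options` helper are hypothetical and are not part of any library; only the field names and the two stated defaults come from this page. The default for `llm_int8_enable_fp32_cpu_offload` is not shown here, so the sketch treats it as unset (`None`) rather than guessing a value.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Quantization8bitsSketch:
    """Hypothetical stand-in that mirrors the documented parameter names."""

    load_in_8bits: bool = True   # default stated in the parameter table
    load_in_4bits: bool = False  # default stated in the parameter table
    # Assumption: the default is not shown on this page, so None means "not set".
    llm_int8_enable_fp32_cpu_offload: Optional[bool] = None

    def to_options(self) -> dict:
        """Return the options that currently hold a value (drops None entries)."""
        return {k: v for k, v in asdict(self).items() if v is not None}


defaults = Quantization8bitsSketch()
offloaded = Quantization8bitsSketch(llm_int8_enable_fp32_cpu_offload=True)

print(defaults.to_options())
print(offloaded.to_options())
```

Keeping the unset option as `None` lets a caller distinguish "left at whatever the real default is" from an explicit `True` or `False`.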
Parameters
Name | Type | Required | Description |
---|---|---|---|
load_in_8bits | boolean | ❌ | Whether to load the model in 8 bits (LLM.int8() algorithm). Defaults: True |
load_in_4bits | boolean | ❌ | Whether to load the model in 4 bits. Defaults: False |
llm_int8_enable_fp32_cpu_offload | boolean |