BitsandbytesQuantization8bits Configuration
Parameters for 8-bit quantization with bitsandbytes.
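As a rough illustration of how the two loading flags in the parameter table fit together, here is a minimal Python sketch. The class `Quantization8bitsSketch` and its `validate` method are hypothetical stand-ins written for this example, not the library's actual class. The field names and defaults are taken from the table. The rule that 8-bit and 4-bit loading cannot both be enabled is an assumption.

```python
from dataclasses import dataclass, asdict


@dataclass
class Quantization8bitsSketch:
    """Hypothetical stand-in that mirrors the documented fields."""

    load_in_8bits: bool = True   # documented default: True
    load_in_4bits: bool = False  # documented default: False

    def validate(self) -> None:
        # Assumption: a model is loaded in either 8 bits or 4 bits, not both.
        if self.load_in_8bits and self.load_in_4bits:
            raise ValueError(
                "load_in_8bits and load_in_4bits cannot both be True"
            )


if __name__ == "__main__":
    config = Quantization8bitsSketch()
    config.validate()
    print(asdict(config))  # {'load_in_8bits': True, 'load_in_4bits': False}
```

With the defaults, the sketch describes an 8-bit load. Setting both flags to `True` is rejected by the hypothetical `validate` check.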
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
| load_in_8bits | boolean | ❌ | Whether to load the model in 8 bits (LLM.int8() algorithm). Defaults: True |
| load_in_4bits | boolean | ❌ | Whether to load the model in 4 bits. Defaults: False |
| llm_int8_enable_fp32_cpu_offload | boolean |