Version: v0.7.0

llm Configuration

This document provides an overview of all configuration classes for the llm type. A short usage sketch follows the table below.

Configuration Classes

| Class | Description | Documentation |
| --- | --- | --- |
| BaichuanDeployModelParameters | Baichuan Proxy LLM | View Details |
| BitsandbytesQuantization | Bits and bytes quantization parameters. | View Details |
| BitsandbytesQuantization4bits | Bits and bytes quantization 4 bits parameters. | View Details |
| BitsandbytesQuantization8bits | Bits and bytes quantization 8 bits parameters. | View Details |
| ClaudeDeployModelParameters | Claude Proxy LLM | View Details |
| DeepSeekDeployModelParameters | Deepseek proxy LLM configuration. | View Details |
| GeminiDeployModelParameters | Google Gemini proxy LLM configuration. | View Details |
| GiteeDeployModelParameters | Gitee proxy LLM configuration. | View Details |
| HFLLMDeployModelParameters | Local deploy model parameters. | View Details |
| LlamaCppModelParameters | Local llama.cpp deploy model parameters (constructor signature below). | View Details |
| LlamaServerParameters | llama.cpp server deploy model parameters (constructor signature below). | View Details |
| MoonshotDeployModelParameters | Moonshot proxy LLM configuration. | View Details |
| OllamaDeployModelParameters | Ollama proxy LLM configuration. | View Details |
| OpenAICompatibleDeployModelParameters | OpenAI Compatible Proxy LLM | View Details |
| SiliconFlowDeployModelParameters | SiliconFlow proxy LLM configuration. | View Details |
| SparkDeployModelParameters | Xunfei Spark proxy LLM configuration. | View Details |
| TongyiDeployModelParameters | Tongyi proxy LLM configuration. | View Details |
| VLLMDeployModelParameters | Local deploy model parameters. | View Details |
| VolcengineDeployModelParameters | Volcengine proxy LLM configuration. | View Details |
| WenxinDeployModelParameters | Baidu Wenxin proxy LLM configuration. | View Details |
| YiDeployModelParameters | Yi proxy LLM configuration. | View Details |
| ZhipuDeployModelParameters | Zhipu proxy LLM configuration. | View Details |

LlamaCppModelParameters and LlamaServerParameters carry no short description; their auto-generated constructor signatures are reproduced here for reference (`<factory>` marks a dataclass default_factory default):

```
LlamaCppModelParameters(
    name: str, provider: str = 'llama.cpp', verbose: Optional[bool] = False,
    concurrency: Optional[int] = 5, backend: Optional[str] = None,
    prompt_template: Optional[str] = None, context_length: Optional[int] = None,
    reasoning_model: Optional[bool] = None, path: Optional[str] = None,
    device: Optional[str] = None, seed: Optional[int] = -1,
    n_threads: Optional[int] = None, n_batch: Optional[int] = 512,
    n_gpu_layers: Optional[int] = 1000000000, n_gqa: Optional[int] = None,
    rms_norm_eps: Optional[float] = 5e-06, cache_capacity: Optional[str] = None,
    prefer_cpu: Optional[bool] = False)
```

```
LlamaServerParameters(
    name: str, provider: str = 'llama.cpp.server', verbose: Optional[bool] = False,
    concurrency: Optional[int] = 20, backend: Optional[str] = None,
    prompt_template: Optional[str] = None, context_length: Optional[int] = None,
    reasoning_model: Optional[bool] = None, path: Optional[str] = None,
    model_hf_repo: Optional[str] = None, model_hf_file: Optional[str] = None,
    device: Optional[str] = None, server_bin_path: Optional[str] = None,
    server_host: str = '127.0.0.1', server_port: int = 0,
    temperature: float = 0.8, seed: int = 42, debug: bool = False,
    model_url: Optional[str] = None, model_draft: Optional[str] = None,
    threads: Optional[int] = None, n_gpu_layers: Optional[int] = None,
    batch_size: Optional[int] = None, ubatch_size: Optional[int] = None,
    ctx_size: Optional[int] = None, grp_attn_n: Optional[int] = None,
    grp_attn_w: Optional[int] = None, n_predict: Optional[int] = None,
    slot_save_path: Optional[str] = None, n_slots: Optional[int] = None,
    cont_batching: bool = False, embedding: bool = False, reranking: bool = False,
    metrics: bool = False, slots: bool = False, draft: Optional[int] = None,
    draft_max: Optional[int] = None, draft_min: Optional[int] = None,
    api_key: Optional[str] = None, lora_files: List[str] = <factory>,
    no_context_shift: bool = False, no_webui: Optional[bool] = None,
    startup_timeout: Optional[int] = None)
```
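These parameter classes are plain dataclasses, so a deployment can be described by instantiating one directly. The sketch below is only an illustration: it builds a LlamaCppModelParameters instance using fields taken from the signature above, but the import path is a placeholder and must be replaced with the actual module shown on the class's details page.

```python
# Minimal sketch, not a verified example.
# The module path below is a placeholder (hypothetical); use the real import
# path listed on the "View Details" page for LlamaCppModelParameters.
# Field names and defaults come from the constructor signature shown above.
from your_llm_package import LlamaCppModelParameters  # hypothetical import path

params = LlamaCppModelParameters(
    name="local-llama",                                # required logical model name
    provider="llama.cpp",                              # default provider for this class
    path="/models/llama-3-8b-instruct.Q4_K_M.gguf",    # local GGUF model file
    context_length=4096,                               # context window size
    n_gpu_layers=35,                                   # layers offloaded to the GPU
    prefer_cpu=False,                                  # set True to force CPU inference
)
print(params)
```

The proxy-LLM classes (for example OpenAICompatibleDeployModelParameters or DeepSeekDeployModelParameters) follow the same pattern; see each class's details page for its specific fields.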