vLLM Inference
DB-GPT supports vLLM inference, a fast and easy-to-use library for LLM inference and serving.
Install dependencies
vLLM is an optional dependency in DB-GPT. You can install it manually with the following command:

```shell
pip install -e ".[vllm]"
```
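Because vLLM is optional, it may be useful to verify that the package is actually importable before starting DB-GPT. The small sketch below is not part of DB-GPT; it only uses the standard library's `importlib.util` to check whether `vllm` can be found in the current environment.

```python
# Sketch: check whether the optional vllm package is importable.
import importlib.util


def vllm_available() -> bool:
    """Return True if the vllm package can be imported in this environment."""
    return importlib.util.find_spec("vllm") is not None


if __name__ == "__main__":
    if vllm_available():
        print("vllm is installed")
    else:
        print('vllm is not installed; run: pip install -e ".[vllm]"')
```

Running this before changing the configuration avoids a startup failure caused by a missing dependency.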
Modify configuration file
In the `.env` configuration file, change the model's inference type to start vLLM inference:

```properties
LLM_MODEL=glm-4-9b-chat
MODEL_TYPE=vllm
# Modify the following configuration if you have GPU resources
# gpu_memory_utilization=0.8
```
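To make the effect of these settings concrete, the following sketch parses `KEY=VALUE` pairs from a `.env`-style string, skipping blank lines and `#` comments. This is an illustration of the file format only, not DB-GPT's actual configuration loader.

```python
# Illustrative sketch (not DB-GPT's loader): parse KEY=VALUE pairs
# from .env-style text, ignoring blank lines and "#" comments.
def parse_env(text: str) -> dict:
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip comments and blank lines
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip()
    return config


env_text = """\
LLM_MODEL=glm-4-9b-chat
MODEL_TYPE=vllm
# gpu_memory_utilization=0.8
"""
config = parse_env(env_text)
print(config["MODEL_TYPE"])  # vllm
```

Note that the commented-out `gpu_memory_utilization` line is skipped: it takes effect only after you uncomment it.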
For the full list of models supported by vLLM, please refer to the vLLM supported models documentation.