# vLLM Inference
DB-GPT supports vLLM inference. vLLM is a fast and easy-to-use library for LLM inference and serving.
## Install dependencies
vLLM is an optional dependency in DB-GPT. You can install it by passing the `--extra "vllm"` flag when installing dependencies.
```bash
# Use uv to install dependencies needed for vllm
# Install core dependencies and select desired extensions
uv sync --all-packages \
    --extra "base" \
    --extra "hf" \
    --extra "cuda121" \
    --extra "vllm" \
    --extra "rag" \
    --extra "storage_chromadb" \
    --extra "quant_bnb" \
    --extra "dbgpts"
```
## Modify configuration file
After installing the dependencies, you can modify your configuration file to use the `vllm` provider.
```toml
# Model Configurations
[models]
[[models.llms]]
name = "THUDM/glm-4-9b-chat-hf"
provider = "vllm"
# If not provided, the model will be downloaded from the Hugging Face model hub
# Uncomment the following line to specify the model path in the local file system
# path = "the-model-path-in-the-local-file-system"
```
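With the configuration in place, start the DB-GPT webserver and point it at your config file. The path below is an example; substitute the location of your own TOML file:

```bash
# Start the DB-GPT webserver with the vLLM configuration
# (example config path; adjust to where your file lives)
uv run dbgpt start webserver --config configs/dbgpt-local-vllm.toml
```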
For the full list of models supported by vLLM, please refer to the [vLLM supported models documentation](https://docs.vllm.ai/en/latest/models/supported_models.html).