Version: v0.6.0

Source Code Deployment

Environmental requirements

| Startup Mode | CPU * MEM | GPU | Remark |
|--------------|-----------|-----|--------|
| Proxy model  | 4C * 8G   | None | The proxy model does not rely on a GPU |
| Local model  | 8C * 32G  | 24G  | A GPU with 24G or more VRAM is recommended for local startup |
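Before proceeding, you can quickly check whether your machine meets these requirements. A minimal sketch for a Linux host; nvidia-smi only applies to the local-model mode and assumes NVIDIA drivers are installed:

# CPU cores and memory
nproc
free -h
# GPU and VRAM (local-model mode only)
nvidia-smi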

Download source code


Download DB-GPT

git clone https://github.com/eosphoros-ai/DB-GPT.git

Miniconda environment installation

  • The default database is SQLite, so no separate database needs to be installed in the default startup mode. If you need to use another database, read the advanced tutorials below. We recommend creating the Python virtual environment through conda; for installing Miniconda, refer to the Miniconda installation tutorial (a minimal install sketch follows below).
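If Miniconda is not installed yet, the following is a minimal sketch for a Linux x86_64 host, using the installer from the official Miniconda repository (adjust the installer name for other platforms):

# download and run the official Miniconda installer
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
# reload the shell so the conda command is on PATH
source ~/.bashrc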

Create a Python virtual environment

# python >= 3.10
conda create -n dbgpt_env python=3.10
conda activate dbgpt_env

# it will take some minutes
pip install -e ".[default]"

Copy environment variables

cp .env.template .env
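If a .env file already exists from an earlier deployment, copying the template over it would discard your settings. One way to guard against this, assuming GNU or BSD cp, is the no-clobber flag:

# copy the template only if .env does not exist yet
cp -n .env.template .env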

Model deployment

DB-GPT can be deployed on low-spec servers through a proxy model, or run as a fully private local model in a GPU environment. If your hardware is limited, you can use third-party large language model API services such as OpenAI, Azure, Qwen, or ERNIE Bot.

note

⚠️ You need to ensure that git-lfs is installed

● CentOS installation: yum install git-lfs
● Ubuntu installation: apt-get install git-lfs
● macOS installation: brew install git-lfs
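After installing the package, Git LFS also needs to be activated once before cloning model repositories from Hugging Face:

# one-time activation; prints "Git LFS initialized."
git lfs install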

Proxy model

Install dependencies

pip install -e ".[openai]"

Download embedding model

cd DB-GPT
mkdir models && cd models
git clone https://huggingface.co/GanymedeNil/text2vec-large-chinese

Configure the proxy by modifying LLM_MODEL, PROXY_SERVER_URL and PROXY_API_KEY in the .env file

# .env
LLM_MODEL=chatgpt_proxyllm
PROXY_API_KEY={your-openai-sk}
PROXY_SERVER_URL=https://api.openai.com/v1/chat/completions
# If you use gpt-4
# PROXYLLM_BACKEND=gpt-4
note

⚠️ Be careful not to overwrite the contents of the .env configuration file
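To verify the proxy configuration before starting DB-GPT, you can call the endpoint directly. A quick sanity check, assuming an OpenAI-compatible server and that your key is exported in the shell as PROXY_API_KEY:

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $PROXY_API_KEY" \
  -d '{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "ping"}]}'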

Local model

Hardware requirements description
| Model | Quantize | VRAM Size |
|-------|----------|-----------|
| Vicuna-7b-1.5 | 4-bit | 8GB |
| Vicuna-7b-1.5 | 8-bit | 12GB |
| Vicuna-13b-v1.5 | 4-bit | 12GB |
| Vicuna-13b-v1.5 | 8-bit | 24GB |
Download LLM
cd DB-GPT
mkdir models && cd models

# embedding model
git clone https://huggingface.co/GanymedeNil/text2vec-large-chinese
# or
git clone https://huggingface.co/moka-ai/m3e-large

# llm model; if you use an OpenAI, Azure, or Tongyi LLM API service, you don't need to download an llm model
git clone https://huggingface.co/lmsys/vicuna-13b-v1.5

Environment variable configuration: set the LLM_MODEL parameter in the .env file
# .env
LLM_MODEL=vicuna-13b-v1.5
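As a quick check, the models directory should now contain both the embedding model and the LLM weights. The expected layout, assuming the default model names used above:

ls models
# expected: text2vec-large-chinese  vicuna-13b-v1.5  (or m3e-large as the embedding model)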

llama.cpp (CPU)

note

⚠️ llama.cpp can be run on Mac M1 or Mac M2

DB-GPT also supports the lower-cost inference framework llama.cpp, which can be used through llama-cpp-python.

Document preparation

Before using llama.cpp, you first need to prepare the model file in gguf format. There are two ways to obtain it; choose whichever suits you.


Method 1: Download the converted model

If you want to use Vicuna-13b-v1.5, you can download the already-converted file from TheBloke/vicuna-13B-v1.5-GGUF; only this single file is needed. Download it into the model path and rename it to ggml-model-q4_0.gguf.

wget https://huggingface.co/TheBloke/vicuna-13B-v1.5-GGUF/resolve/main/vicuna-13b-v1.5.Q4_K_M.gguf -O models/ggml-model-q4_0.gguf

Method 2: Convert files yourself

You can also convert the model file yourself by following the instructions in llama.cpp#prepare-data--run, then place the converted file in the models directory and name it ggml-model-q4_0.gguf. A rough outline of the conversion follows.
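As a rough outline, following the llama.cpp README at the time of writing (script and binary names change between llama.cpp releases, so treat this as a sketch and check your checkout):

# inside a llama.cpp checkout: convert HF weights to gguf, then quantize to q4_0
python3 convert.py /path/to/vicuna-13b-v1.5/
./quantize /path/to/vicuna-13b-v1.5/ggml-model-f16.gguf models/ggml-model-q4_0.gguf q4_0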

Install dependencies

llama.cpp is an optional installation item in DB-GPT. You can install it with the following command.

pip install -e ".[llama_cpp]"

Modify configuration file

Modify the .env file to use llama.cpp, as sketched below, and then start the service with the command from the Run service section.
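A minimal sketch of the change, assuming the gguf file is at models/ggml-model-q4_0.gguf as prepared above (llama-cpp is the model name DB-GPT registers for llama.cpp inference; check .env.template if your version differs):

# .env
LLM_MODEL=llama-cpp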

More descriptions

| Environment Variable | Default Value | Description |
|----------------------|---------------|-------------|
| llama_cpp_prompt_template | None | Prompt template; currently supports zero_shot, vicuna_v1.1, alpaca, llama-2, baichuan-chat, internlm-chat. If None, the prompt template is inferred automatically from the model path. |
| llama_cpp_model_path | None | Model path |
| llama_cpp_n_gpu_layers | 1000000000 | Number of layers to offload to the GPU; 1000000000 offloads all layers. If your GPU is low on memory, set a lower number, for example 10. |
| llama_cpp_n_threads | None | Number of threads to use. If None, the number of threads is determined automatically. |
| llama_cpp_n_batch | 512 | Maximum number of prompt tokens batched together when calling llama_eval |
| llama_cpp_n_gqa | None | For the llama-2 70B model, grouped-query attention must be 8. |
| llama_cpp_rms_norm_eps | 5e-06 | For the llama-2 model, 5e-6 is a good value. |
| llama_cpp_cache_capacity | None | Maximum model cache size, for example 2000MiB or 2GiB |
| llama_cpp_prefer_cpu | False | If a GPU is available, it is used by default; set prefer_cpu=True to prefer the CPU. |
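Putting a few of these together, a hypothetical .env fragment for a GPU with limited memory (variable names as listed above; check .env.template for the exact spelling and casing in your version):

# .env (hypothetical llama.cpp tuning for a low-memory GPU)
LLM_MODEL=llama-cpp
llama_cpp_n_gpu_layers=10
llama_cpp_n_batch=512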

Install DB-GPT Application Database

NOTE

You do not need to separately create the database tables related to the DB-GPT application in SQLite; they will be created automatically for you by default.

Test data (optional)

The DB-GPT project ships with some test data built in by default, which can be loaded into the local database for testing with the following command:

  • Linux
bash ./scripts/examples/load_examples.sh

  • Windows
.\scripts\examples\load_examples.bat

Run service

The DB-GPT service is packaged as a server; the entire DB-GPT service can be started with the following command.

python dbgpt/app/dbgpt_server.py
NOTE


If you are running version v0.4.3 or earlier, please start with the following command:

python pilot/server/dbgpt_server.py

Run DB-GPT with the dbgpt command

If you want to run DB-GPT with the dbgpt command:

dbgpt start webserver

Visit website

Open your browser and visit http://localhost:5670
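If the page does not load, you can first confirm from the command line that the service is listening:

# should return an HTTP status line if the web server is up
curl -I http://localhost:5670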