Version: dev

Knowledge Base

Build and manage knowledge bases for Retrieval-Augmented Generation (RAG). Upload documents, configure how they are retrieved, and use them as context in chat.

Creating a knowledge base

Step 1 — Navigate to Knowledge

Click Knowledge in the sidebar to open the knowledge management page.

Step 2 — Create a new knowledge base

1. Click Create (or the + button)
2. Fill in:
   - Name: A descriptive name for the knowledge base
   - Description: A brief description of the content
   - Embedding Model: The model used to vectorize documents (must match the embedding model configured in your deployment)
3. Click Create

Step 3 — Upload documents

1. Open the knowledge base you just created
2. Click Upload to add documents
3. Select one or more files
4. Wait for processing to complete (chunking, embedding, and indexing)

Supported file formats

Format	Extensions
Documents	`.pdf`, `.docx`, `.doc`, `.txt`, `.md`
Spreadsheets	`.xlsx`, `.xls`, `.csv`
Web	`.html`, `.htm`
Data	`.json`
Code	`.py`, `.java`, `.js`, `.ts`, etc.
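Before uploading a batch of files, it can help to check them against the table above. The following is a minimal sketch, not part of DB-GPT: the `SUPPORTED_EXTENSIONS` set and the `split_by_support` helper are hypothetical names, and because the Code row ends in "etc.", the set is illustrative rather than exhaustive.

```python
from pathlib import Path

# Extensions copied from the table above. The "Code" row ends with "etc.",
# so this set is a partial, illustrative list (an assumption, not an official one).
SUPPORTED_EXTENSIONS = {
    ".pdf", ".docx", ".doc", ".txt", ".md",  # documents
    ".xlsx", ".xls", ".csv",                 # spreadsheets
    ".html", ".htm",                         # web
    ".json",                                 # data
    ".py", ".java", ".js", ".ts",            # code (partial)
}


def split_by_support(paths):
    """Return (supported, unsupported) lists, comparing suffixes case-insensitively."""
    supported, unsupported = [], []
    for raw in paths:
        path = Path(raw)
        if path.suffix.lower() in SUPPORTED_EXTENSIONS:
            supported.append(path)
        else:
            unsupported.append(path)
    return supported, unsupported


ok, skipped = split_by_support(["report.PDF", "notes.md", "photo.png"])
print([p.name for p in ok])
print([p.name for p in skipped])
```

Files that land in the second list are only "not in this sketch's set"; the upload dialog remains the authority on what is actually accepted.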

Using a knowledge base in chat

1. Go to Chat and create a new conversation
2. Select Chat Knowledge mode
3. Choose your knowledge base from the dropdown
4. Ask questions; the LLM answers using chunks retrieved from your documents as context
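To make "documents as context" concrete, here is a rough sketch of how retrieved chunks are commonly placed in front of a question before it is sent to an LLM. The `build_prompt` function and the prompt wording are assumptions for illustration only; they are not DB-GPT's actual prompt template.

```python
def build_prompt(question, chunks):
    """Join retrieved chunks into a numbered context block placed before the question."""
    context = "\n\n".join(
        f"[{i}] {text}" for i, text in enumerate(chunks, start=1)
    )
    # Hypothetical template: real systems usually add more instructions.
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


prompt = build_prompt(
    "What is the refund window?",
    [
        "Refunds are accepted within 30 days.",
        "Shipping takes 5 business days.",
    ],
)
print(prompt)
```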

Knowledge base settings

Each knowledge base has configurable settings:

Setting	Description	Default
Chunk Size	Maximum number of characters in each chunk	512
Chunk Overlap	Overlap between consecutive chunks, in the same unit as Chunk Size	50
Top K	Maximum number of chunks retrieved per query	5
Score Threshold	Minimum relevance score a chunk needs to be retrieved	0.3
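The interaction between Chunk Size and Chunk Overlap is easiest to see with a fixed-width sliding window. The sketch below assumes both settings are measured in characters and uses a hypothetical `chunk_text` helper; DB-GPT's own splitters may cut on sentence or paragraph boundaries instead, so treat this as an illustration of the two parameters, not of the actual implementation.

```python
def chunk_text(text, chunk_size=512, chunk_overlap=50):
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


text = "".join(str(i % 10) for i in range(1000))  # 1000 characters of sample text
chunks = chunk_text(text)
print([len(c) for c in chunks])
print(chunks[0][-50:] == chunks[1][:50])
```

With the defaults, each window advances by 462 characters, so a larger overlap produces more chunks for the same document.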

Tuning retrieval

- Large documents: Increase Chunk Size so each chunk keeps more surrounding context
- Incomplete answers: Increase Top K and lower Score Threshold so more chunks reach the LLM
- Noisy results: Raise Score Threshold to drop weakly related chunks
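The effect of Top K and Score Threshold can be sketched as a filter followed by a cutoff. This example assumes that a higher score means a more relevant chunk and that a chunk is kept when its score is greater than or equal to the threshold; the `select_chunks` helper and the sample scores are hypothetical, and the real retriever may differ in both respects.

```python
def select_chunks(scored_chunks, top_k=5, score_threshold=0.3):
    """Keep chunks scoring at least score_threshold, then return the top_k best."""
    kept = [
        (chunk, score)
        for chunk, score in scored_chunks
        if score >= score_threshold  # assumption: higher score = more relevant
    ]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]


# Made-up similarity scores for five candidate chunks.
candidates = [("a", 0.82), ("b", 0.41), ("c", 0.35), ("d", 0.28), ("e", 0.12)]

default = select_chunks(candidates)                      # default settings
strict = select_chunks(candidates, score_threshold=0.4)  # fewer, stronger matches
wide = select_chunks(candidates, top_k=10, score_threshold=0.1)  # more context

print([c for c, _ in default])
print([c for c, _ in strict])
print([c for c, _ in wide])
```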

Storage types

DB-GPT supports multiple vector storage backends:

Backend	Description	Install Extra
ChromaDB	Default, embedded, no setup needed	`storage_chromadb`
Milvus	Distributed vector database for production	`storage_milvus`
OceanBase	Cloud-native distributed database	`storage_oceanbase`

To use a non-default backend, add the corresponding extra to your install command:

uv sync --all-packages --extra "storage_milvus" ...

Advanced features

Graph RAG

DB-GPT supports knowledge graphs for structured retrieval:

- Extracts entities and relationships from documents
- Enables graph-based queries alongside vector search
- Useful for complex domain knowledge with interconnected concepts

See Graph RAG for setup instructions.
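As a rough mental model of graph-based retrieval, extracted facts can be stored as (subject, relation, object) triples and queried by walking outward from an entity. The `TripleStore` class below is a hypothetical, in-memory illustration; the triples are written by hand rather than extracted from documents, and it does not reflect how DB-GPT stores or queries its knowledge graph.

```python
class TripleStore:
    """Tiny in-memory store of (subject, relation, object) triples."""

    def __init__(self):
        self._out = {}  # subject -> list of (relation, object)

    def add(self, subject, relation, obj):
        self._out.setdefault(subject, []).append((relation, obj))

    def neighbors(self, entity, max_hops=1):
        """Return triples reachable from entity within max_hops outgoing edges."""
        seen = {entity}
        frontier = [entity]
        found = []
        for _ in range(max_hops):
            next_frontier = []
            for node in frontier:
                for relation, obj in self._out.get(node, []):
                    found.append((node, relation, obj))
                    if obj not in seen:
                        seen.add(obj)
                        next_frontier.append(obj)
            frontier = next_frontier
        return found


store = TripleStore()
store.add("DB-GPT", "supports", "Milvus")
store.add("Milvus", "is_a", "vector database")
store.add("DB-GPT", "supports", "ChromaDB")

one_hop = store.neighbors("DB-GPT", max_hops=1)
two_hops = store.neighbors("DB-GPT", max_hops=2)
print(one_hop)
print(two_hops)
```

The second query also surfaces the fact about Milvus, which is the kind of connected context that plain vector search over isolated chunks can miss.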

Keyword retrieval (BM25)

For hybrid retrieval combining vector and keyword search:

uv sync --all-packages --extra "rag_bm25" ...

This enables BM25 indexing alongside vector embeddings for improved recall.
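For intuition about the keyword side of hybrid retrieval, the sketch below scores documents with the standard Okapi BM25 formula. It is a self-contained illustration only: the `bm25_scores` function, the whitespace tokenizer, and the `k1` and `b` values are assumptions, and it shows neither the `rag_bm25` extra's implementation nor how keyword scores are merged with vector scores.

```python
import math
from collections import Counter


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Return one Okapi BM25 score per document for a whitespace-tokenized query."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    if n_docs == 0:
        return []
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq[term]
            if tf == 0:
                continue  # term absent from this document
            df = doc_freq[term]
            idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1)
            # Length normalization: long documents are penalized via b.
            norm = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "milvus is a distributed vector database",
    "chromadb is the default embedded backend",
    "bm25 ranks documents by keyword overlap",
]
scores = bm25_scores("distributed vector database", docs)
print(scores)
```

Only the first document contains the query terms, so it is the only one with a non-zero score; exact keyword matches like this are what BM25 adds on top of embedding similarity.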

Managing knowledge bases

Action	How
View	Click on a knowledge base to see its documents and settings
Add documents	Use the Upload button within the knowledge base
Delete documents	Select documents and click Delete
Delete knowledge base	Use the Delete button on the knowledge base card

Deleting is permanent

Deleting a knowledge base removes all associated vector embeddings and indexed data. The original uploaded files are not recoverable.

Next steps

Topic	Link
Use knowledge in chat	Chat
RAG concepts	RAG
Advanced RAG configuration	RAG Tutorial
