Knowledge Base
Build and manage knowledge bases for Retrieval-Augmented Generation (RAG). Upload documents, configure retrieval, and use them in chat.
Creating a knowledge base
Step 1: Navigate to Knowledge
Click Knowledge in the sidebar to open the knowledge management page.
Step 2: Create a new knowledge base
- Click Create (or the + button)
- Fill in:
  - Name: A descriptive name for the knowledge base
  - Description: Brief description of the content
  - Embedding Model: The embedding model to use for vectorization (must match your configured embedding)
- Click Create
Step 3: Upload documents
- Open the knowledge base you just created
- Click Upload to add documents
- Select one or more files
- Wait for processing to complete (chunking, embedding, and indexing)
Supported file formats
| Format | Extensions |
|---|---|
| Documents | .pdf, .docx, .doc, .txt, .md |
| Spreadsheets | .xlsx, .xls, .csv |
| Web | .html, .htm |
| Data | .json |
| Code | .py, .java, .js, .ts, etc. |
Using a knowledge base in chat
- Go to Chat and create a new conversation
- Select Chat Knowledge mode
- Choose your knowledge base from the dropdown
- Ask questions; the LLM will use your documents as context
Knowledge base settings
Each knowledge base has configurable settings:
| Setting | Description | Default |
|---|---|---|
| Chunk Size | Maximum characters per chunk | 512 |
| Chunk Overlap | Overlap between consecutive chunks | 50 |
| Top K | Number of chunks to retrieve per query | 5 |
| Score Threshold | Minimum relevance score for retrieval | 0.3 |
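To show how Chunk Size and Chunk Overlap interact, here is a minimal sketch of fixed-size character chunking. It is an illustration only, not DB-GPT's actual splitter: the function name `chunk_text` is hypothetical, and it assumes both settings are measured in characters.

```python
def chunk_text(text: str, chunk_size: int = 512, chunk_overlap: int = 50) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share chunk_overlap characters. This is a
    hypothetical helper for illustration, not a DB-GPT API.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


sample = "".join(chr(97 + i % 26) for i in range(1000))
chunks = chunk_text(sample)
```

With the default values, each new chunk starts 462 characters after the previous one, so the last 50 characters of one chunk are repeated at the start of the next.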
Tuning retrieval
- Large documents: Increase chunk size to preserve context
- Incomplete or missing answers: Increase Top K and lower the score threshold so that more chunks are retrieved
- Noisy results: Raise the score threshold
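The effect of Top K and Score Threshold can be illustrated with a small filter over scored chunks. This is a sketch under assumptions, not DB-GPT's retrieval code: `ScoredChunk` and `select_chunks` are hypothetical names, higher scores are assumed to mean more relevant, and the threshold is assumed to be inclusive.

```python
from typing import NamedTuple


class ScoredChunk(NamedTuple):
    text: str
    score: float  # assumed: higher means more relevant


def select_chunks(
    candidates: list[ScoredChunk],
    top_k: int = 5,
    score_threshold: float = 0.3,
) -> list[ScoredChunk]:
    """Drop chunks below the threshold, then keep the top_k best."""
    kept = [c for c in candidates if c.score >= score_threshold]
    kept.sort(key=lambda c: c.score, reverse=True)
    return kept[:top_k]


candidates = [
    ScoredChunk("a", 0.9), ScoredChunk("b", 0.25), ScoredChunk("c", 0.5),
    ScoredChunk("d", 0.31), ScoredChunk("e", 0.7), ScoredChunk("f", 0.6),
    ScoredChunk("g", 0.4),
]
default_result = select_chunks(candidates)
strict_result = select_chunks(candidates, score_threshold=0.65)
```

In this example the default settings return five chunks, while raising the threshold to 0.65 leaves only the two highest-scoring ones, which is the trade-off behind the tuning advice above.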
Storage types
DB-GPT supports multiple vector storage backends:
| Backend | Description | Install Extra |
|---|---|---|
| ChromaDB | Default, embedded, no setup needed | storage_chromadb |
| Milvus | Distributed vector database for production | storage_milvus |
| OceanBase | Cloud-native distributed database | storage_oceanbase |
To use a non-default backend, add the corresponding extra to your install command:
uv sync --all-packages --extra "storage_milvus" ...
Advanced features
Graph RAG
DB-GPT supports knowledge graphs for structured retrieval:
- Extracts entities and relationships from documents
- Enables graph-based queries alongside vector search
- Useful for complex domain knowledge with interconnected concepts
See Graph RAG for setup instructions.
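As a rough illustration of what graph-based retrieval adds, the sketch below indexes (subject, relation, object) triples by entity so that every fact touching an entity can be looked up directly. It is a toy model only: the triples are hand-written rather than extracted by an LLM, and `index_triples` and `facts_about` are hypothetical names, not DB-GPT APIs.

```python
from collections import defaultdict

Triple = tuple[str, str, str]  # (subject, relation, object)


def index_triples(triples: list[Triple]) -> dict[str, list[Triple]]:
    """Index each triple under both its subject and its object."""
    index: dict[str, list[Triple]] = defaultdict(list)
    for triple in triples:
        subject, _relation, obj = triple
        index[subject].append(triple)
        if obj != subject:
            index[obj].append(triple)
    return index


def facts_about(index: dict[str, list[Triple]], entity: str) -> list[Triple]:
    """Return every stored triple that mentions the entity."""
    return list(index.get(entity, []))


triples = [
    ("DB-GPT", "supports", "Milvus"),
    ("Milvus", "is a", "vector database"),
    ("DB-GPT", "supports", "ChromaDB"),
]
index = index_triples(triples)
milvus_facts = facts_about(index, "Milvus")
```

A lookup for "Milvus" returns both the fact where it is the object and the fact where it is the subject, which is the kind of connected context that plain vector similarity can miss.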
Keyword retrieval (BM25)
For hybrid retrieval combining vector and keyword search:
uv sync --all-packages --extra "rag_bm25" ...
This enables BM25 indexing alongside vector embeddings for improved recall.
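One common way to merge a vector ranking with a keyword ranking is reciprocal rank fusion (RRF). The sketch below shows that technique in isolation; whether DB-GPT uses RRF or another fusion method is not stated here, and the function name and the constant `k = 60` are assumptions for illustration.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by multiple retrievers rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by fused score (descending), breaking ties by ID for stability.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["a", "b", "c"]   # hypothetical vector-search ranking
keyword_hits = ["c", "a", "d"]  # hypothetical BM25 ranking
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
```

Documents found by both retrievers ("a" and "c") end up ahead of those found by only one, which is how hybrid retrieval can improve recall without discarding the vector ranking.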
Managing knowledge bases
| Action | How |
|---|---|
| View | Click on a knowledge base to see its documents and settings |
| Add documents | Use the Upload button within the knowledge base |
| Delete documents | Select documents and click Delete |
| Delete knowledge base | Use the Delete button on the knowledge base card |
Deleting is permanent
Deleting a knowledge base removes all associated vector embeddings and indexed data. The original uploaded files are not recoverable.
Next steps
| Topic | Link |
|---|---|
| Use knowledge in chat | Chat |
| RAG concepts | RAG |
| Advanced RAG configuration | RAG Tutorial |