Skip to main content

DB-GPT V0.7.0, MCP + DeepSeek R1: Bringing More Possibilities to LLM Applications

· 10 min read

DB-GPT V0.7.0 Release: MCP Protocol Support, DeepSeek R1 Model Integration, Complete Architecture Upgrade, GraphRAG Retrieval Chain Enhancement, and More..

DB-GPT is an open-source AI Native Data App Development framework with AWEL and Agents. In version V0.7.0, we have reorganized the DB-GPT module packages, splitting the original modules, restructuring the entire framework configuration system, and providing a clearer, more flexible, and more extensible management and development capability for building AI native data applications around large models.

V0.7.0 version mainly adds and enhances the following core features

🍀 Support for MCP(Model Context Protocol) protocol.

🍀 Integrated DeepSeek R1, QWQ inference models, all original DB-GPT Chat scenarios now cover deep thinking capabilities.

🍀 GraphRAG retrieval chain enhancement: support for "Vector" and "Intent Recognition+Text2GQL" graph retrievers.

🍀 DB-GPT module package restructuring, original dbgpt package split into dbgpt-core, dbgpt-ext, dbgpt-serve, dbgpt-client, dbgpt-acclerator, dbgpt-app.

🍀 Reconstructed DB-GPT configuration system, configuration files changed to ".toml" format, abolishing the original .env configuration logic.

✨New Features

1. Support for MCP(Model Context Protocol) protocol

Usage instructions:

a. Run the MCP SSE Server gateway:

npx -y supergateway --stdio "uvx mcp-server-fetch"

Here we are running the web scraping mcp-server-fetch

b. Create a Multi-agent+ Auto-Planning+ MCP web page scraping and summarization APP.

c. Configure the APP, select the ToolExpert and Summarizer agents, and add a resource of type tool(mcp(sse)) to ToolExpert, where mcp_servers should be filled with the service address started in step a (default is: http://127.0.0.1:8000/sse), then save the application.

d. Select the newly created MCP Web Fetch APP to chat, provide a webpage for the APP to summarize:

The example input question is: What does this webpage talk about https://www.cnblogs.com/fnng/p/18744210"

2. Integrated DeepSeek R1 inference model

And all Chat scenarios in original DB-GPT now have deep thinking capabilities.

For quick usage reference: http://docs.dbgpt.cn/docs/next/quickstart

Data analysis scenario:

Knowledge base scenario:

3. GraphRAG retrieval chain enhancement: support for "Vector" and "Intent Recognition+Text2GQL" graph retrievers.

  • "Vector" graph retriever

During the knowledge graph construction process, vectors are added to all nodes and edges and indexes are established. When querying, the question is vectorized and through TuGraph-DB's built-in vector indexing capability, based on the HNSW algorithm, topk related nodes and edges are queried. Compared to keyword graph retrieval, it can identify more ambiguous questions.

Configuration example:

[rag.storage.graph]
type = "TuGraph"
host="127.0.0.1"
port=7687
username="admin"
password="73@TuGraph"

enable_summary="True"
triplet_graph_enabled="True"
document_graph_enabled="True"

# Vector graph retrieval configuration items
enable_similarity_search="True"
knowledge_graph_embedding_batch_size=20
similarity_search_topk=5
extract_score_threshold=0.7
  • "Intent Recognition+Text2GQL" graph retriever

The question is rewritten through the intent recognition module, extracting true intent and involved entities and relationships, and then translated using the Text2GQL model into GQL statements for direct querying. It can perform more precise graph queries and display corresponding query statements. In addition to calling large model API services, you can also use ollama to call local Text2GQL models.

Configuration example:

[rag.storage.graph]
type = "TuGraph"
host="127.0.0.1"
port=7687
username="admin"
password="73@TuGraph"

enable_summary="True"
triplet_graph_enabled="True"
document_graph_enabled="True"

# Intent Recognition+Text2GQL graph retrieval configuration items
enable_text_search="True"

# Use Ollama to deploy independent text2gql model, enable the following configuration items
# text2gql_model_enabled="True"
# text2gql_model_name="tugraph/CodeLlama-7b-Cypher-hf:latest"

4. DB-GPT module package restructuring

Original dbgpt package split into dbgpt-core, dbgpt-ext, dbgpt-serve, dbgpt-client, dbgpt-acclerator, dbgpt-app

As dbgpt has gradually developed, the service modules have increased, making functional regression testing difficult and compatibility issues more frequent. Therefore, the original dbgpt content has been modularized:

  • dbgpt-core: Mainly responsible for core module interface definitions of dbgpt's awel, model, agent, rag, storage, datasource, etc., releasing Python SDK.
  • dbgpt-ext: Mainly responsible for implementing dbgpt extension content, including datasource extensions, vector-storage, graph-storage extensions, and model access extensions, making it easier for community developers to quickly use and extend new module content, releasing Python SDK.
  • dbgpt-serve: Mainly provides Restful interfaces for dbgpt's atomized services of each module, making it easy for community users to quickly integrate. No Python SDK is released at this time.
  • dbgpt-app: Mainly responsible for business scenario implementations such as App, ChatData, ChatKnowledge, ChatExcel, Dashboard, etc., with no Python SDK.
  • dbgpt-client: Provides a unified Python SDK client for integration.
  • dbgpt-accelerator: Model inference acceleration module, including compatibility and adaptation for different versions (different torch versions, etc.), platforms (Windows, MacOS, and Linux), hardware environments (CPU, CUDA, and ROCM), inference frameworks (vLLM, llama.cpp), quantization methods (AWQ, bitsandbytes, GPTQ), and other acceleration modules (accelerate, flash-attn), providing cross-platform, installable underlying environments on-demand for other DB-GPT modules.

5. Restructured DB-GPT configuration system

The new configuration files using ".toml" format, abolishing the original .env configuration logic, each module can have its own configuration class, and automatically generate front-end configuration pages.

For quick usage reference: http://docs.dbgpt.cn/docs/next/quickstart

For all configurations reference: http://docs.dbgpt.cn/docs/next/config-reference/app/config_chatdashboardconfig_2480d0

[system]
# Load language from environment variable(It is set by the hook)
language = "${env:DBGPT_LANG:-zh}"
api_keys = []
encrypt_key = "your_secret_key"

# Server Configurations
[service.web]
host = "0.0.0.0"
port = 5670

[service.web.database]
type = "sqlite"
path = "pilot/meta_data/dbgpt.db"
[service.model.worker]
host = "127.0.0.1"

[rag.storage]
[rag.storage.vector]
type = "chroma"
persist_path = "pilot/data"

# Model Configurations
[models]
[[models.llms]]
name = "deepseek-reasoner"
# name = "deepseek-chat"
provider = "proxy/deepseek"
api_key = "your_deepseek_api_key"

6. Support for S3, OSS storage

DB-GPT unified storage extension OSS and S3 implementation, where the S3 implementation supports most cloud storage compatible with the S3 protocol. DB-GPT knowledge base original files, Chat Excel related intermediate files, AWEL Flow node parameter files, etc. all support cloud storage.

Configuration example:

[[serves]]
type = "file"
# Default backend for file server
default_backend = "s3"

[[serves.backends]]
type = "oss"
endpoint = "https://oss-cn-beijing.aliyuncs.com"
region = "oss-cn-beijing"
access_key_id = "${env:OSS_ACCESS_KEY_ID}"
access_key_secret = "${env:OSS_ACCESS_KEY_SECRET}"
fixed_bucket = "{your_bucket_name}"

[[serves.backends]]
# Use Tencent COS s3 compatible API as the file server
type = "s3"
endpoint = "https://cos.ap-beijing.myqcloud.com"
region = "ap-beijing"
access_key_id = "${env:COS_SECRETID}"
access_key_secret = "${env:COS_SECRETKEY}"
fixed_bucket = "{your_bucket_name}

For detailed configuration instructions, please refer to: http://docs.dbgpt.cn/docs/next/config-reference/utils/config_s3storageconfig_f0cdc9

7. Production-level llama.cpp inference support

Based on llama.cpp HTTP Server, supporting continuous batching, multi-user parallel inference, etc., llama.cpp inference moves towards production systems.

Configuration example:

# Model Configurations
[models]
[[models.llms]]
name = "DeepSeek-R1-Distill-Qwen-1.5B"
provider = "llama.cpp.server"
# If not provided, the model will be downloaded from the Hugging Face model hub
# uncomment the following line to specify the model path in the local file system
# https://huggingface.co/bartowski/DeepSeek-R1-Distill-Qwen-1.5B-GGUF
# path = "the-model-path-in-the-local-file-system"
path = "models/DeepSeek-R1-Distill-Qwen-1.5B-Q4_K_M.gguf

8. Multi-model deployment persistence

Currently, most models can be integrated on the DB-GPT page, with configuration information persistently saved and models automatically loaded when the service starts.

9. LLM, Embedding, Reranker extension capability enhancement

Optimized the model extension approach, requiring only a few lines of code to integrate new models.

10. Native scenario support for conversation-round and token-based memory, with independent configuration support for each scenario

Configuration example:

[app]
# Unified temperature configuration for all scenarios
temperature = 0.6

[[app.configs]]
name = "chat_excel"
# Use custom temperature configuration
temperature = 0.1
duckdb_extensions_dir = []
force_install = true

[[app.configs]]
name = "chat_normal"
memory = {type="token", max_token_limit=20000}

[[app.configs]]
name = "chat_with_db_qa"
schema_retrieve_top_k = 50
memory = {type="window", keep_start_rounds=0, keep_end_rounds=10}

11. Chat Excel, Chat Data & Chat DB and Chat Dashboard native scenario optimization

  • Chat Data, Chat Dashboard support for streaming output.
  • Optimization of library table field knowledge processing and recall
  • Chat Excel optimization, supporting more complex table understanding and chart conversations, even small parameter-scale open-source LLMs can handle it well.

12. Front-end page support for LaTeX mathematical formula rendering

13. AWEL Flow support for simple conversation templates

14. Support for lightweight Docker images containing only proxy models (arm64 & amd64)

One-click deployment command for DB-GPT:

docker run -it --rm -e SILICONFLOW_API_KEY=${SILICONFLOW_API_KEY} \
-p 5670:5670 --name dbgpt eosphorosai/dbgpt-openai

You can also use the build script to build your own image:

bash docker/base/build_image.sh --install-mode openai

For details, see the documentation: http://docs.dbgpt.cn/docs/next/installation/docker-build-guide

15. DB-GPT API compatible with OpenAI SDK

from openai import OpenAI

DBGPT_API_KEY = "dbgpt"

client = OpenAI(
api_key=DBGPT_API_KEY,
base_url="http://localhost:5670/api/v2",
)

messages = [
{
"role": "user",
"content": "Hello, how are you?",
},
]

has_thinking = False
reasoning_content = ""
for chunk in client.chat.completions.create(
model="deepseek-chat",
messages=messages,
extra_body={
"chat_mode": "chat_normal",
},
stream=True,
max_tokens=4096,
):
delta_content = chunk.choices[0].delta.content
if hasattr(chunk.choices[0].delta, "reasoning_content"):
reasoning_content = chunk.choices[0].delta.reasoning_content
if reasoning_content:
if not has_thinking:
print("<thinking>", flush=True)
print(reasoning_content, end="", flush=True)
has_thinking = True
if delta_content:
if has_thinking:
print("</thinking>", flush=True)
print(delta_content, end="", flush=True)
has_thinking = False

16. Data source extension capability enhancement

After the backend supports new data sources, the frontend can automatically identify and dynamically configure them.

17. Agent resource support for dynamic parameter configuration

Frontend automatically identifies resource configuration parameters while remaining compatible with old configurations.

18. ReAct Agent support, Agent tool calling capability enhancement

19. IndexStore extension capability enhancement

IndexStore configuration restructuring, new storage implementations automatically scanned and discovered

20. AWEL flow compatibility enhancement

Cross-version compatibility for AWEL flow based on multi-version metadata.

🐞 Bug Fixes

Chroma support for Chinese knowledge base spaces, AWEL Flow issue fixes, fixed multi-platform Lyric installation error issues and local embedding model error issues, along with 40+ other bugs.

🛠️Others

Support for Ruff code formatting, multi-version documentation building, unit test fixes, and 20+ other issue fixes or feature enhancements.

Upgrade Guide:

1. Metadata database upgrade

For SQLite upgrades, table structures will be automatically upgraded by default. For MySQL upgrades, DDL needs to be executed manually. The assets/schema/dbgpt.sql file contains the complete DDL for the current version. Specific version change DDLs can be found in the assets/schema/upgrade directory. For example, if you are upgrading from v0.6.3 to v0.7.0, you can execute the following DDL:

mysql -h127.0.0.1 -uroot -p{your_password} < ./assets/schema/upgrade/v0_7_0/upgrade_to_v0.7.0.sql

2. Vector database upgrade

Due to underlying changes in Chroma storage in v0.7.0, version 0.7.0 does not support reading content from older versions. Please re-import knowledge bases and refresh data sources. Other vector storage solutions are not affected.

✨Official Documentation

English

Overview | DB-GPT

Chinese

概览

✨Acknowledgements

Thanks to all contributors for making this release possible!

283569391@qq.com, @15089677014, @Aries-ckt, @FOkvj, @Jant1L, @SonglinLyu, @TenYearOldJAVA, @Weaxs, @cinjoseph, @csunny, @damonqin, @dusx1981, @fangyinc, @geebytes, @haawha, @utopia2077, @vnicers, @xuxl2024, @yhjun1026, @yunfeng1993, @yyhhyyyyyy and tam

This version took nearly three months to develop and has been merged to the main branch for over a month. Hundreds of users participated in testing version 0.7.0, with GitHub receiving hundreds of issues feedback. Some users directly submitted PR fixes. The DB-GPT community sincerely thanks every user and contributor who participated in version 0.7.0!

✨Appendix

DB-GPT V0.6.0, Defining new standards for AI-native data applications.

· 4 min read

Introduction

DB-GPT is an open source AI native data application development framework with AWEL and agents. In the V0.6.0 version, we further provide flexible and scalable AI native data application management and development capabilities around large models, which can help enterprises quickly build and deploy intelligent AI data applications, and achieve enterprise digital transformation and business growth through intelligent data analysis, insights and decisions

The V0.6.0 version mainly adds and enhances the following core features

  • AWEL protocol upgrade 2.0, supporting more complex orchestration

  • Supports the creation and lifecycle management of data applications, and supports multiple application construction modes, such as: multi-agent automatic planning mode, task flow orchestration mode, single agent mode, and native application mode

  • GraphRAG supports graph community summary and hybrid retrieval, and the graph index cost is reduced by 50% compared to Microsoft GraphRAG.

  • Supports multiple Agent Memories, such as perceptual memory, short-term memory, long-term memory, hybrid memory, etc.

  • Supports intent recognition and prompt management, and newly added support for Text2NLU and Text2GQL fine-tuning

  • GPT-Vis front-end visualization upgrade to support richer visualization charts

Features

AWEL protocol upgrade 2.0 supports more complex orchestration and optimizes front-end visualization and interaction capabilities.

AWEL (Agentic Workflow Expression Language) is an agent-based workflow expression language designed specifically for large model application development, providing powerful functions and flexibility. Through the AWEL API, developers can focus on large model application logic development without having to pay attention to cumbersome model, environment and other details. In AWEL2.0, we support more complex orchestration and visualization

Supports the creation and life cycle management of data applications, and supports multiple modes to build applications, such as: multi-agent automatic planning mode, task flow orchestration mode, single agent mode, and native application mode

GraphRAG supports graph community summarization and hybrid retrieval.

The graph construction and retrieval performance have obvious advantages compared to community solutions, and it supports cool visualization. GraphRAG is an enhanced retrieval generation system based on knowledge graphs. Through the construction and retrieval of knowledge graphs, it further enhances the accuracy of retrieval and the stability of recall, while reducing the illusion of large models and enhancing the effects of domain applications. DB-GPT combines with TuGraph to build efficient retrieval enhancement generation capabilities

Based on the universal RAG framework launched in DB-GPT version 0.5.6 that integrates vector index, graph index, and full-text index, DB-GPT version 0.6.0 has enhanced the capabilities of graph index (GraphRAG) to support graph community summary and hybrid retrieval. ability. In the new version, we introduced TuGraph’s built-in Leiden community discovery algorithm, combined with large models to extract community subgraph summaries, and finally used similarity recall of community summaries to cope with generalized questioning scenarios, namely QFS (Query Focused Summarization). question. In addition, in the knowledge extraction stage, we upgraded the original triple extraction to graph extraction with point edge information summary, and optimized cross-text block associated information extraction through text block history to further enhance the information density of the knowledge graph.

Based on the above design, we used the open source knowledge graph corpus (OSGraph) provided by the TuGraph community and the product introduction materials of DB-GPT and TuGraph (about 43k tokens in total), and conducted comparative tests with Microsoft's GraphRAG system. Finally, DB-GPT It only consumes 50% of the token overhead and generates a knowledge graph of the same scale. And on the premise that the quality of the question and answer test is equivalent, the global search performance has been significantly improved.

For the final generated knowledge graph, we used AntV's G6 engine to upgrade the front-end rendering logic, which can intuitively preview the knowledge graph data and community segmentation results.

GPT-Vis: GPT-Vis is an interactive visualization solution for LLM and data, supporting rich visual chart display and intelligent recommendations

Text2GQL and Text2NLU fine-tuning: Newly supports fine-tuning from natural language to graph language, as well as fine-tuning for semantic classification.

Acknowledgements

This iteration is inseparable from the participation of developers and users in the community, and it also further cooperates with the TuGraph and AntV communities. Thanks to all the contributors who made this release possible!

@Aries-ckt, @Dreammy23, @Hec-gitHub, @JxQg, @KingSkyLi, @M1n9X, @bigcash, @chaplinthink, @csunny, @dusens, @fangyinc, @huangjh131, @hustcc, @lhwan, @whyuds and @yhjun1026

Reference

DB-GPT Now Supports Meta Llama 3.1 Series Models

· 2 min read
Fangyin Cheng
DB-GPT Core Team

We are thrilled to announce that DB-GPT now supports inference with the Meta Llama 3.1 series models!

Introducing Meta Llama 3.1

Meta Llama 3.1 is a state-of-the-art series of language models developed by Meta AI. Designed with cutting-edge techniques, the Llama 3.1 models offer unparalleled performance and versatility. Here are some of the key highlights:

  • Variety of Models: Meta Llama 3.1 is available in 8B, 70B, and 405B versions, each with both instruction-tuned and base models, supporting contexts up to 128k tokens.
  • Multilingual Support: Supports 8 languages, including English, German, and French.
  • Extensive Training: Trained on over 1.5 trillion tokens, utilizing 250 million human and synthetic samples for fine-tuning.
  • Flexible Licensing: Permissive model output usage allows for adaptation into other large language models (LLMs).
  • Quantization Support: Available in FP8, AWQ, and GPTQ quantized versions for efficient inference.
  • Performance: The Llama 3 405B version has outperformed GPT-4 in several benchmarks.
  • Enhanced Efficiency: The 8B and 70B models have seen a 12% improvement in coding and instruction-following capabilities.
  • Tool and Function Call Support: Supports tool usage and function calling.

How to Access Meta Llama 3.1

Your can access the Meta Llama 3.1 models according to Access to Hugging Face.

For comprehensive documentation and additional details, please refer to the model card.

Using Meta Llama 3.1 in DB-GPT

Please read the Source Code Deployment to learn how to install DB-GPT from source code.

Llama 3.1 needs upgrade your transformers >= 4.43.0, please upgrade your transformers:

pip install --upgrade "transformers>=4.43.0"

Please cd to the DB-GPT root directory:

cd DB-GPT

We assume that your models are stored in the models directory, e.g., models/Meta-Llama-3.1-8B-Instruct.

Then modify your .env file:

LLM_MODEL=meta-llama-3.1-8b-instruct
# LLM_MODEL=meta-llama-3.1-70b-instruct
# LLM_MODEL=meta-llama-3.1-405b-instruct
## you can also specify the model path
# LLM_MODEL_PATH=models/Meta-Llama-3.1-8B-Instruct
## Quantization settings
# QUANTIZE_8bit=False
# QUANTIZE_4bit=True
## You can configure the maximum memory used by each GPU.
# MAX_GPU_MEMORY=16Gib

Then you can run the following command to start the server:

dbgpt start webserver

Open your browser and visit http://localhost:5670 to use the Meta Llama 3.1 models in DB-GPT.

Enjoy the power of Meta Llama 3.1 in DB-GPT!