Version: v0.7.4

Observability

Observability is a measure of how well internal states of a system can be inferred from knowledge of its external outputs. In the context of a software system, observability is the ability to understand the internal state of the system by examining its outputs. This is important for debugging, monitoring, and maintaining the system.

Observability In DB-GPT

DB-GPT provides observability through the following mechanisms:

Logging: DB-GPT logs various events and metrics to help you understand the internal state of the system.
Tracing: DB-GPT provides tracing capabilities to help you understand the flow of requests through the system.

Logging

You can configure the logging level and storage location for DB-GPT logs. By default, logs are stored in the logs directory in the DB-GPT root directory. You can change the log level and storage location by setting the DBGPT_LOG_LEVEL and DBGPT_LOG_DIR environment.

Tracing

DB-GPT has built-in tracing capabilities that allow you to trace the flow of requests through the system.

Trace Storage

Local Storage

DB-GPT will store traces in the traces directory in the DB-GPT logs directory, by default, they are located in logs/dbgpt*.jsonl.

If you want to know more about the local storage of traces and how to use them, you can refer to the Debugging documentation.

OpenTelemetry Support

DB-GPT also supports OpenTelemetry for distributed tracing. Now, you can export traces to open-telemetry compatible backends like Jaeger, Zipkin, and others with OpenTelemetry Protocol (OTLP).

To enable OpenTelemetry support, you need install following packages:

pip install opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp

Then, modify your .env file to enable OpenTelemetry tracing:

## Whether to enable DB-GPT send trace to OpenTelemetry
TRACER_TO_OPEN_TELEMETRY=True
## More details see https://opentelemetry-python.readthedocs.io/en/latest/exporter/otlp/otlp.html
OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://localhost:4317

In the above configuration, you can change the OTEL_EXPORTER_OTLP_TRACES_ENDPOINT to your OTLP collector or backend, we use gRPC endpoint by default.

Here, we use Jaeger as an example to show how to use OpenTelemetry to trace DB-GPT.

Jaeger Support

Here is an example of how to use Jaeger to trace DB-GPT with docker:

Run the Jaeger all-in-one image:

docker run --rm --name jaeger \
  -e COLLECTOR_ZIPKIN_HOST_PORT=:9411 \
  -p 6831:6831/udp \
  -p 6832:6832/udp \
  -p 5778:5778 \
  -p 16686:16686 \
  -p 4317:4317 \
  -p 4318:4318 \
  -p 14250:14250 \
  -p 14268:14268 \
  -p 14269:14269 \
  -p 9411:9411 \
  jaegertracing/all-in-one:1.58

Then, modify your .env file to enable OpenTelemetry tracing like above.

TRACER_TO_OPEN_TELEMETRY=True
OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://localhost:4317

Start the DB-GPT server:

dbgpt start webserver

Now, you can access the Jaeger UI at http://localhost:16686 to view the traces.

Here are some examples of screenshot of Jaeger UI:

Search Traces Page

Show Normal Conversation Trace

Show Conversation Detail Tags

Show Agent Conversation Trace

Show Trace In Cluster

Jaeger Support With Docker Compose

If you want to use docker-compose to start DB-GPT and Jaeger, you can use the following docker-compose.yml file:

# An example of using docker-compose to start a cluster with observability enabled.
version: '3.10'

services:
  jaeger:
    image: jaegertracing/all-in-one:1.58
    restart: unless-stopped
    networks:
      - dbgptnet
    ports:
      # serve frontend
      - "16686:16686"
      # accept jaeger.thrift over Thrift-compact protocol (used by most SDKs)
      - "6831:6831"
      # accept OpenTelemetry Protocol (OTLP) over HTTP
      - "4318:4318"
      # accept OpenTelemetry Protocol (OTLP) over gRPC
      - "4317:4317"
      - "14268:14268"
    environment:
      - LOG_LEVEL=debug
      - SPAN_STORAGE_TYPE=badger
      - BADGER_EPHEMERAL=false
      - BADGER_DIRECTORY_VALUE=/badger/data
      - BADGER_DIRECTORY_KEY=/badger/key
    volumes:
      - jaeger-badger:/badger
    user: root
  controller:
    image: eosphorosai/dbgpt:latest
    command: dbgpt start controller
    restart: unless-stopped
    environment:
      - TRACER_TO_OPEN_TELEMETRY=True
      - OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://jaeger:4317
      - DBGPT_LOG_LEVEL=DEBUG
    networks:
      - dbgptnet
  llm-worker:
    image: eosphorosai/dbgpt:latest
    command: dbgpt start worker --model_type proxy --model_name chatgpt_proxyllm --model_path chatgpt_proxyllm --proxy_server_url ${OPENAI_API_BASE}/chat/completions --proxy_api_key ${OPENAI_API_KEY} --controller_addr http://controller:8000
    environment:
      # Your real openai model name, e.g. gpt-3.5-turbo, gpt-4o
      - PROXYLLM_BACKEND=gpt-3.5-turbo
      - TRACER_TO_OPEN_TELEMETRY=True
      - OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://jaeger:4317
      - DBGPT_LOG_LEVEL=DEBUG
    depends_on:
      - controller
    restart: unless-stopped
    networks:
      - dbgptnet
    ipc: host
  embedding-worker:
    image: eosphorosai/dbgpt:latest
    command: dbgpt start worker --worker_type text2vec --model_name proxy_http_openapi --model_path proxy_http_openapi --proxy_server_url ${OPENAI_API_BASE}/embeddings --proxy_api_key ${OPENAI_API_KEY} --controller_addr http://controller:8000
    environment:
      - proxy_http_openapi_proxy_backend=text-embedding-3-small
      - TRACER_TO_OPEN_TELEMETRY=True
      - OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://jaeger:4317
      - DBGPT_LOG_LEVEL=DEBUG
    depends_on:
      - controller
    restart: unless-stopped
    networks:
      - dbgptnet
    ipc: host
  webserver:
    image: eosphorosai/dbgpt:latest
    command: dbgpt start webserver --light --remote_embedding --controller_addr http://controller:8000
    environment:
      - LLM_MODEL=chatgpt_proxyllm
      - EMBEDDING_MODEL=proxy_http_openapi
      - TRACER_TO_OPEN_TELEMETRY=True
      - OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://jaeger:4317
    depends_on:
      - controller
      - llm-worker
      - embedding-worker
    volumes:
      - dbgpt-data:/app/pilot/data
      - dbgpt-message:/app/pilot/message
    ports:
      - 5670:5670/tcp
    restart: unless-stopped
    networks:
      - dbgptnet
volumes:
  dbgpt-data:
  dbgpt-message:
  jaeger-badger:
networks:
  dbgptnet:
    driver: bridge
    name: dbgptnet

You can start the cluster with the following command:

OPENAI_API_KEY="{your api key}" OPENAI_API_BASE="https://api.openai.com/v1" docker compose up -d

Please replace {your api key} with your real OpenAI API key and https://api.openai.com/v1 with your real OpenAI API base URL. You can see more details about the docker-compose file in the docker/compose_examples/observability/docker-compose.yml documentation.

After the cluster is started, you can access the Jaeger UI at http://localhost:16686 to view the traces.

Show RAG Conversation Trace

In the above screenshot, you can see the trace of cross-service communication between the DB-GPT controller, LLM worker, and webserver.

Observability

Observability In DB-GPT​

Logging​

Tracing​

Trace Storage​

Local Storage​

OpenTelemetry Support​

Jaeger Support​

Jaeger Support With Docker Compose​