Datasets Benchmark
Get started with the Benchmark API
Create Dataset Benchmark Task
POST /api/v2/serve/evaluate/execute_benchmark_task
DBGPT_API_KEY=dbgpt
curl -X POST "http://localhost:5670/api/v2/serve/evaluate/execute_benchmark_task" \
-H "Authorization: Bearer $DBGPT_API_KEY" \
-H "accept: application/json" \
-H "Content-Type: application/json" \
-d '{
  "scene_key": "recall",
  "scene_value": "Falcon_benchmark_01",
  "model_list": ["DeepSeek-V3.1", "Qwen3-235B-A22B"]
}'
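The same request can be built with Python's standard library. This is a minimal sketch: the endpoint path, header names, and body fields are taken from the curl example above, while the host, port, and API key are placeholder assumptions you should replace with your own configuration.

```python
import json
import urllib.request

# Assumed values -- replace with your own deployment's host and key.
API_KEY = "dbgpt"
URL = "http://localhost:5670/api/v2/serve/evaluate/execute_benchmark_task"

def build_benchmark_request(scene_key, scene_value, model_list,
                            temperature=0.7, max_tokens=None):
    """Build (but do not send) the POST request for a benchmark task.

    Defaults mirror the documented ones: temperature 0.7, max_tokens None.
    """
    payload = {
        "scene_key": scene_key,
        "scene_value": scene_value,
        "model_list": model_list,
        "temperature": temperature,
    }
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    return urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "accept": "application/json",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_benchmark_request(
    "recall", "Falcon_benchmark_01", ["DeepSeek-V3.1", "Qwen3-235B-A22B"])
# Sending is left to the caller, e.g.:
# with urllib.request.urlopen(req) as resp:
#     result = json.load(resp)
```

Keeping request construction separate from sending makes the payload easy to inspect or log before it hits the server.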
The Benchmark Request Object
scene_key string Required
The scene type of the evaluation. Supported values include app and recall.
scene_value string Required
The scene value of the benchmark, such as the name of the evaluation task.
model_list array Required
The list of model names the benchmark will execute, e.g. ["DeepSeek-V3.1", "Qwen3-235B-A22B"]. Note: each name must match a model configured on the DB-GPT platform.
temperature float
The sampling temperature of the LLM. Default is 0.7.
max_tokens int
The maximum number of tokens the LLM may generate. Default is None (no limit).
The Benchmark Result
status string
The benchmark status, e.g. success, failed, running.
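A client typically polls the task until the status leaves the running state. A minimal sketch of that check, assuming the terminal states are exactly the success and failed values listed above:

```python
# Terminal states taken from the status values documented above;
# any other value is treated as still in progress (an assumption).
TERMINAL_STATES = {"success", "failed"}

def is_finished(status: str) -> bool:
    """Return True once a benchmark task has reached a terminal state."""
    return status.lower() in TERMINAL_STATES
```

For example, `is_finished("running")` is False, so a polling loop would sleep and re-check, while `is_finished("success")` or `is_finished("failed")` ends the loop.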