Knowledge Chat API
The Knowledge Chat API provides a conversational AI interface based on RAG (Retrieval-Augmented Generation). Because it follows the OpenAI Chat Completions API specification, existing SDKs, libraries, and tools from the OpenAI ecosystem, such as the OpenAI Python/JS SDK, LangChain, and LlamaIndex, work with it directly; you only need to change the base_url.
The Knowledge Chat API is served at the /v1 path of the Knowledge Builder service. It uses a different Base URL from the Manager API (/api/v1).
Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /v1/chat/completions | Chat Completion (streaming/non-streaming) |
| GET | /v1/models | Available LLM model list |
Custom Headers
The Knowledge Chat API supports custom headers for controlling RAG search scope and strategy.
| Header | Required | Description |
|---|---|---|
| X-Knowledge-Id | Yes | Knowledge ID for the RAG search target |
| X-Document-Ids | No | Document IDs to restrict the search scope (comma-separated) |
| X-RAG-Strategy | No | RAG strategy: vector, text, hybrid, or agentic (see RAG Strategies below) |
POST /v1/chat/completions
Performs Knowledge-based RAG Chat Completion.
Request Body
| Field | Type | Default | Description |
|---|---|---|---|
| model | string | - | LLM model name (e.g., gpt-4o-mini) |
| messages | array | - | Conversation message array |
| temperature | float | 0.7 | Generation temperature (0.0–2.0) |
| top_p | float | 1.0 | Top-p sampling (0.0–1.0) |
| max_tokens | integer | null | Maximum generation tokens |
| stream | boolean | false | Enable SSE streaming |
| stop | array \| string | null | Stop sequences |
Each item in the messages array:
| Field | Type | Description |
|---|---|---|
| role | string | Role: system, user, assistant |
| content | string | Message content |
Non-Streaming Response
200 OK
{
  "id": "chatcmpl-1718448000000",
  "object": "chat.completion",
  "created": 1718448000,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "To create a pipeline in D.Hub..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 256,
    "completion_tokens": 128,
    "total_tokens": 384
  }
}
Streaming Response
When stream: true is set, the response is in Server-Sent Events (SSE) format.
Content-Type: text/event-stream
data: {"id":"chatcmpl-1718448000000","object":"chat.completion.chunk","created":1718448000,"model":"gpt-4o-mini","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}
data: {"id":"chatcmpl-1718448000000","object":"chat.completion.chunk","created":1718448000,"model":"gpt-4o-mini","choices":[{"index":0,"delta":{"content":"D.Hub"},"finish_reason":null}]}
data: {"id":"chatcmpl-1718448000000","object":"chat.completion.chunk","created":1718448000,"model":"gpt-4o-mini","choices":[{"index":0,"delta":{"content":" allows"},"finish_reason":null}]}
data: {"id":"chatcmpl-1718448000000","object":"chat.completion.chunk","created":1718448000,"model":"gpt-4o-mini","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}
data: [DONE]
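The OpenAI SDK streaming example further below handles this format for you. If you are not using an OpenAI-compatible SDK, the stream can also be consumed directly; the following is a minimal sketch using the requests library (an assumption, not part of this API's documentation) that reads each data: line and stops at the [DONE] sentinel.

import json
import requests

# Placeholder token and Knowledge ID, as in the other examples in this document.
resp = requests.post(
    "https://api.dhub.io/v1/chat/completions",
    headers={
        "Authorization": "Bearer <access_token>",
        "Content-Type": "application/json",
        "X-Knowledge-Id": "knowledge-abc123",
    },
    json={
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "What is a dataset schema?"}],
        "stream": True,
    },
    stream=True,
)

for line in resp.iter_lines():
    if not line:
        continue  # skip blank lines between SSE events
    payload = line.decode("utf-8").removeprefix("data: ")
    if payload == "[DONE]":
        break  # end-of-stream sentinel
    chunk = json.loads(payload)
    delta = chunk["choices"][0]["delta"]
    if "content" in delta:
        print(delta["content"], end="", flush=True)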
GET /v1/models
Retrieves the list of available models from the configured LLM providers.
Response
200 OK
{
  "object": "list",
  "data": [
    {
      "id": "gpt-4o-mini",
      "object": "model",
      "created": 1718448000,
      "owned_by": "openai"
    },
    {
      "id": "gpt-4o",
      "object": "model",
      "created": 1718448000,
      "owned_by": "openai"
    }
  ]
}
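With the OpenAI Python SDK, the same list can be retrieved via client.models.list(). A minimal sketch, assuming the base URL and token placeholder used in the usage examples below:

from openai import OpenAI

client = OpenAI(base_url="https://api.dhub.io/v1", api_key="<access_token>")

# Iterate over the model objects returned by GET /v1/models
for model in client.models.list():
    print(model.id)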
Usage Examples
cURL (Non-Streaming)
curl -X POST https://api.dhub.io/v1/chat/completions \
  -H "Authorization: Bearer <access_token>" \
  -H "Content-Type: application/json" \
  -H "X-Knowledge-Id: knowledge-abc123" \
  -H "X-RAG-Strategy: hybrid" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {"role": "system", "content": "You are a D.Hub assistant."},
      {"role": "user", "content": "How do I create a pipeline?"}
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'
cURL (Streaming)
curl -X POST https://api.dhub.io/v1/chat/completions \
  -H "Authorization: Bearer <access_token>" \
  -H "Content-Type: application/json" \
  -H "X-Knowledge-Id: knowledge-abc123" \
  -H "X-Document-Ids: doc-001,doc-002" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {"role": "user", "content": "What is a dataset schema?"}
    ],
    "stream": true
  }'
Python (OpenAI SDK)
from openai import OpenAI
client = OpenAI(
    base_url="https://api.dhub.io/v1",
    api_key="<access_token>",
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a D.Hub assistant."},
        {"role": "user", "content": "Explain the search modes of Knowledge Builder."},
    ],
    temperature=0.7,
    max_tokens=1024,
    extra_headers={
        "X-Knowledge-Id": "knowledge-abc123",
        "X-RAG-Strategy": "hybrid",
    },
)

print(response.choices[0].message.content)
Python (Streaming)
from openai import OpenAI
client = OpenAI(
    base_url="https://api.dhub.io/v1",
    api_key="<access_token>",
)

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "Summarize the main features of D.Hub."},
    ],
    stream=True,
    extra_headers={
        "X-Knowledge-Id": "knowledge-abc123",
    },
)

for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)
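Python (LangChain)
Because the API is OpenAI-compatible, LangChain's OpenAI chat integration can also point at it. A minimal sketch, assuming the langchain-openai package; the class and parameter names below are LangChain's, not part of this API.

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4o-mini",
    base_url="https://api.dhub.io/v1",  # Knowledge Chat API, not the Manager API (/api/v1)
    api_key="<access_token>",
    default_headers={
        "X-Knowledge-Id": "knowledge-abc123",
        "X-RAG-Strategy": "hybrid",
    },
)

print(llm.invoke("How do I create a pipeline?").content)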
RAG Strategies
The X-RAG-Strategy header accepts the following values; the sketch after this list shows how to try each strategy on the same request.
- vector: Semantic similarity search. Suitable for natural language questions.
- text: BM25 keyword search. Suitable for exact terms or code search.
- hybrid: Combines vector + text. Recommended for most situations.
- agentic: An LLM agent automatically determines the search strategy. Useful for complex queries.
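A minimal sketch, reusing the SDK setup from the usage examples above; the knowledge ID and question are placeholders.

from openai import OpenAI

client = OpenAI(base_url="https://api.dhub.io/v1", api_key="<access_token>")

# Ask the same question under each strategy to compare the answers.
for strategy in ["vector", "text", "hybrid", "agentic"]:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "How do I create a pipeline?"}],
        extra_headers={
            "X-Knowledge-Id": "knowledge-abc123",
            "X-RAG-Strategy": strategy,
        },
    )
    print(f"[{strategy}] {response.choices[0].message.content}")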