
Knowledge Chat API

The Knowledge Chat API provides a conversational AI interface based on RAG (Retrieval-Augmented Generation). Since it is compatible with the OpenAI Chat Completions API, you can directly use existing SDKs, libraries, and tools from the OpenAI ecosystem.

OpenAI SDK Compatible

This API follows the OpenAI Chat Completions API specification. You can use it directly with the OpenAI Python/JS SDK, LangChain, LlamaIndex, etc. by simply changing the base_url.
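
For example, LangChain needs only the base_url and the RAG headers; this is a minimal sketch assuming the langchain-openai package and the illustrative host used in the examples below:

```python
from langchain_openai import ChatOpenAI

# Point LangChain's OpenAI chat model at the Knowledge Chat API.
# Host, token, and knowledge ID are placeholders.
llm = ChatOpenAI(
    model="gpt-4o-mini",
    base_url="https://api.dhub.io/v1",
    api_key="<access_token>",
    default_headers={"X-Knowledge-Id": "knowledge-abc123"},
)

print(llm.invoke("How do I create a pipeline?").content)
```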

Base URL

The Knowledge Chat API is served at the /v1 path of the Knowledge Builder service. It uses a different Base URL from the Manager API (/api/v1).
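
For instance, if both services were exposed on the same host (the domain below is an illustrative placeholder matching the later examples), the paths would differ as follows:

```text
Knowledge Chat API:  https://api.dhub.io/v1/...
Manager API:         https://api.dhub.io/api/v1/...
```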

Endpoints

| Method | Path | Description |
| --- | --- | --- |
| POST | `/v1/chat/completions` | Chat Completion (streaming/non-streaming) |
| GET | `/v1/models` | Available LLM model list |

Custom Headers

The Knowledge Chat API supports custom headers for controlling RAG search scope and strategy.

| Header | Required | Description |
| --- | --- | --- |
| `X-Knowledge-Id` | Yes | Knowledge ID for the RAG search target |
| `X-Document-Ids` | No | Document IDs to restrict the search scope (comma-separated) |
| `X-RAG-Strategy` | No | RAG strategy: vector, text, hybrid, agentic |
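
If every request targets the same Knowledge, these headers can also be set once on the client rather than per request; a sketch with the OpenAI Python SDK (host, token, and IDs are placeholders):

```python
from openai import OpenAI

# Attach the RAG headers to every request made through this client.
client = OpenAI(
    base_url="https://api.dhub.io/v1",
    api_key="<access_token>",
    default_headers={
        "X-Knowledge-Id": "knowledge-abc123",
        "X-RAG-Strategy": "hybrid",
    },
)
```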

POST /v1/chat/completions

Performs a Knowledge-based RAG Chat Completion.

Request Body

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `model` | string | - | LLM model name (e.g., gpt-4o-mini) |
| `messages` | array | - | Conversation message array |
| `temperature` | float | 0.7 | Generation temperature (0.0–2.0) |
| `top_p` | float | 1.0 | Top-p sampling (0.0–1.0) |
| `max_tokens` | integer | null | Maximum generation tokens |
| `stream` | boolean | false | Enable SSE streaming |
| `stop` | array \| string | null | Stop sequences |

Each item in the messages array:

| Field | Type | Description |
| --- | --- | --- |
| `role` | string | Role: system, user, assistant |
| `content` | string | Message content |

Non-Streaming Response

200 OK

```json
{
  "id": "chatcmpl-1718448000000",
  "object": "chat.completion",
  "created": 1718448000,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "To create a pipeline in D.Hub..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 256,
    "completion_tokens": 128,
    "total_tokens": 384
  }
}
```

Streaming Response

When stream: true is set, the response is in Server-Sent Events (SSE) format.

```text
Content-Type: text/event-stream

data: {"id":"chatcmpl-1718448000000","object":"chat.completion.chunk","created":1718448000,"model":"gpt-4o-mini","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl-1718448000000","object":"chat.completion.chunk","created":1718448000,"model":"gpt-4o-mini","choices":[{"index":0,"delta":{"content":"D.Hub"},"finish_reason":null}]}

data: {"id":"chatcmpl-1718448000000","object":"chat.completion.chunk","created":1718448000,"model":"gpt-4o-mini","choices":[{"index":0,"delta":{"content":" allows"},"finish_reason":null}]}

data: {"id":"chatcmpl-1718448000000","object":"chat.completion.chunk","created":1718448000,"model":"gpt-4o-mini","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
```
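
Without an SDK, the stream can be consumed by reading lines, stripping the data: prefix, and stopping at the [DONE] sentinel; a minimal sketch using the requests library (host, token, and IDs are placeholders):

```python
import json

import requests

resp = requests.post(
    "https://api.dhub.io/v1/chat/completions",
    headers={
        "Authorization": "Bearer <access_token>",
        "Content-Type": "application/json",
        "X-Knowledge-Id": "knowledge-abc123",
    },
    json={
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "How do I create a pipeline?"}],
        "stream": True,
    },
    stream=True,  # keep the connection open and iterate as chunks arrive
)

for line in resp.iter_lines(decode_unicode=True):
    if not line or not line.startswith("data: "):
        continue  # skip blank keep-alive lines between events
    payload = line[len("data: "):]
    if payload == "[DONE]":
        break  # end-of-stream sentinel
    delta = json.loads(payload)["choices"][0]["delta"]
    if "content" in delta:
        print(delta["content"], end="", flush=True)
```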

GET /v1/models

Retrieves the list of available models from the configured LLM providers.

Response

200 OK

```json
{
  "object": "list",
  "data": [
    {
      "id": "gpt-4o-mini",
      "object": "model",
      "created": 1718448000,
      "owned_by": "openai"
    },
    {
      "id": "gpt-4o",
      "object": "model",
      "created": 1718448000,
      "owned_by": "openai"
    }
  ]
}
```
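
With the OpenAI SDK, the same list is available through client.models.list(); a short sketch assuming the placeholder host and token from the other examples:

```python
from openai import OpenAI

client = OpenAI(base_url="https://api.dhub.io/v1", api_key="<access_token>")

# Iterate over the models exposed by the configured LLM providers.
for model in client.models.list():
    print(model.id)  # e.g., gpt-4o-mini, gpt-4o
```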

Usage Examples

cURL (Non-Streaming)

```bash
curl -X POST https://api.dhub.io/v1/chat/completions \
  -H "Authorization: Bearer <access_token>" \
  -H "Content-Type: application/json" \
  -H "X-Knowledge-Id: knowledge-abc123" \
  -H "X-RAG-Strategy: hybrid" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {"role": "system", "content": "You are a D.Hub assistant."},
      {"role": "user", "content": "How do I create a pipeline?"}
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'
```

cURL (Streaming)

```bash
curl -X POST https://api.dhub.io/v1/chat/completions \
  -H "Authorization: Bearer <access_token>" \
  -H "Content-Type: application/json" \
  -H "X-Knowledge-Id: knowledge-abc123" \
  -H "X-Document-Ids: doc-001,doc-002" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {"role": "user", "content": "What is a dataset schema?"}
    ],
    "stream": true
  }'
```

Python (OpenAI SDK)

```python
from openai import OpenAI

# Point the SDK at the Knowledge Chat API instead of api.openai.com.
client = OpenAI(
    base_url="https://api.dhub.io/v1",
    api_key="<access_token>",
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a D.Hub assistant."},
        {"role": "user", "content": "Explain the search modes of Knowledge Builder."},
    ],
    temperature=0.7,
    max_tokens=1024,
    # RAG scope and strategy are passed as custom headers.
    extra_headers={
        "X-Knowledge-Id": "knowledge-abc123",
        "X-RAG-Strategy": "hybrid",
    },
)

print(response.choices[0].message.content)
```

Python (Streaming)

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.dhub.io/v1",
    api_key="<access_token>",
)

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "Summarize the main features of D.Hub."},
    ],
    stream=True,
    extra_headers={
        "X-Knowledge-Id": "knowledge-abc123",
    },
)

# Print each SSE chunk's content delta as it arrives.
for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)
```

RAG Strategy Selection

- vector: Semantic similarity search. Suitable for natural language questions.
- text: BM25 keyword search. Suitable for exact terms or code search (see the example below).
- hybrid: Combines vector + text. Recommended for most situations.
- agentic: An LLM agent automatically determines the search strategy. Useful for complex queries.
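
For example, a lookup for an exact identifier could pin the strategy to text for a single request; a sketch reusing the client from the Python examples above (the query and IDs are illustrative):

```python
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Where is create_pipeline defined?"}],
    extra_headers={
        "X-Knowledge-Id": "knowledge-abc123",
        "X-RAG-Strategy": "text",  # exact-term (BM25) search for this request only
    },
)
print(response.choices[0].message.content)
```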