Knowledge Chat API
The Knowledge Chat API provides a conversational AI interface based on RAG (Retrieval-Augmented Generation). Because it follows the OpenAI Chat Completions API specification, existing SDKs, libraries, and tools from the OpenAI ecosystem, such as the OpenAI Python/JS SDK, LangChain, and LlamaIndex, work with it directly; you only need to change the base_url.
The Knowledge Chat API is served at the /v1 path of the Knowledge Builder service. It uses a different Base URL from the Manager API (/api/v1).
Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /v1/chat/completions | Chat Completion (streaming/non-streaming) |
| GET | /v1/models | Available LLM model list |
Custom Headers
The Knowledge Chat API supports custom headers for controlling RAG search scope and strategy.
| Header | Required | Description |
|---|---|---|
| X-Knowledge-Id | Yes | Knowledge ID for the RAG search target |
| X-Document-Ids | No | Document IDs to restrict the search scope (comma-separated) |
| X-RAG-Strategy | No | RAG strategy: vector, text, hybrid, or agentic (see RAG Strategies below) |
POST /v1/chat/completions
Performs Knowledge-based RAG Chat Completion.
Request Body
| Field | Type | Default | Description |
|---|---|---|---|
| model | string | - | LLM model name (e.g., gpt-4o-mini) |
| messages | array | - | Conversation message array |
| temperature | float | 0.7 | Generation temperature (0.0–2.0) |
| top_p | float | 1.0 | Top-p sampling (0.0–1.0) |
| max_tokens | integer | null | Maximum generation tokens |
| stream | boolean | false | Enable SSE streaming |
| stop | array \| string | null | Stop sequences |
Each item in the messages array:
| Field | Type | Description |
|---|---|---|
| role | string | Role: system, user, assistant |
| content | string | Message content |
Non-Streaming Response
200 OK
{
  "id": "chatcmpl-1718448000000",
  "object": "chat.completion",
  "created": 1718448000,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "To create a pipeline in D.Hub..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 256,
    "completion_tokens": 128,
    "total_tokens": 384
  }
}
Streaming Response
When stream: true is set, the response is in Server-Sent Events (SSE) format.
Content-Type: text/event-stream
data: {"id":"chatcmpl-1718448000000","object":"chat.completion.chunk","created":1718448000,"model":"gpt-4o-mini","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}
data: {"id":"chatcmpl-1718448000000","object":"chat.completion.chunk","created":1718448000,"model":"gpt-4o-mini","choices":[{"index":0,"delta":{"content":"D.Hub"},"finish_reason":null}]}
data: {"id":"chatcmpl-1718448000000","object":"chat.completion.chunk","created":1718448000,"model":"gpt-4o-mini","choices":[{"index":0,"delta":{"content":" allows"},"finish_reason":null}]}
data: {"id":"chatcmpl-1718448000000","object":"chat.completion.chunk","created":1718448000,"model":"gpt-4o-mini","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}
data: [DONE]
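The OpenAI SDK streaming example further below handles this format for you. If you are not using an OpenAI-compatible SDK, the stream can also be consumed directly; the following is a minimal sketch using the requests library (an assumption, not part of this API's documentation) that reads each data: line and stops at the [DONE] sentinel.

import json
import requests

# Placeholder token and Knowledge ID, as in the other examples in this document.
resp = requests.post(
    "https://api.dhub.io/v1/chat/completions",
    headers={
        "Authorization": "Bearer <access_token>",
        "Content-Type": "application/json",
        "X-Knowledge-Id": "knowledge-abc123",
    },
    json={
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "What is a dataset schema?"}],
        "stream": True,
    },
    stream=True,
)

for line in resp.iter_lines():
    if not line:
        continue  # skip blank lines between SSE events
    payload = line.decode("utf-8").removeprefix("data: ")
    if payload == "[DONE]":
        break  # end-of-stream sentinel
    chunk = json.loads(payload)
    delta = chunk["choices"][0]["delta"]
    if "content" in delta:
        print(delta["content"], end="", flush=True)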
GET /v1/models
Retrieves the list of available models from the configured LLM providers.
Response
200 OK
{
  "object": "list",
  "data": [
    {
      "id": "gpt-4o-mini",
      "object": "model",
      "created": 1718448000,
      "owned_by": "openai"
    },
    {
      "id": "gpt-4o",
      "object": "model",
      "created": 1718448000,
      "owned_by": "openai"
    }
  ]
}
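With the OpenAI Python SDK, the same list can be retrieved via client.models.list(). A minimal sketch, assuming the base URL and token placeholder used in the usage examples below:

from openai import OpenAI

client = OpenAI(base_url="https://api.dhub.io/v1", api_key="<access_token>")

# Iterate over the model objects returned by GET /v1/models
for model in client.models.list():
    print(model.id)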
Usage Examples
cURL (Non-Streaming)
curl -X POST https://api.dhub.io/v1/chat/completions \
  -H "Authorization: Bearer <access_token>" \
  -H "Content-Type: application/json" \
  -H "X-Knowledge-Id: knowledge-abc123" \
  -H "X-RAG-Strategy: hybrid" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {"role": "system", "content": "You are a D.Hub assistant."},
      {"role": "user", "content": "How do I create a pipeline?"}
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'
cURL (Streaming)
curl -X POST https://api.dhub.io/v1/chat/completions \
  -H "Authorization: Bearer <access_token>" \
  -H "Content-Type: application/json" \
  -H "X-Knowledge-Id: knowledge-abc123" \
  -H "X-Document-Ids: doc-001,doc-002" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {"role": "user", "content": "What is a dataset schema?"}
    ],
    "stream": true
  }'
Python (OpenAI SDK)
from openai import OpenAI
client = OpenAI(
    base_url="https://api.dhub.io/v1",
    api_key="<access_token>",
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a D.Hub assistant."},
        {"role": "user", "content": "Explain the search modes of Knowledge Builder."},
    ],
    temperature=0.7,
    max_tokens=1024,
    extra_headers={
        "X-Knowledge-Id": "knowledge-abc123",
        "X-RAG-Strategy": "hybrid",
    },
)

print(response.choices[0].message.content)
Python (Streaming)
from openai import OpenAI
client = OpenAI(
    base_url="https://api.dhub.io/v1",
    api_key="<access_token>",
)

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "Summarize the main features of D.Hub."},
    ],
    stream=True,
    extra_headers={
        "X-Knowledge-Id": "knowledge-abc123",
    },
)

for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)
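Python (LangChain)
Because the API is OpenAI-compatible, LangChain's OpenAI chat integration can also point at it. A minimal sketch, assuming the langchain-openai package; the class and parameter names below are LangChain's, not part of this API.

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4o-mini",
    base_url="https://api.dhub.io/v1",  # Knowledge Chat API, not the Manager API (/api/v1)
    api_key="<access_token>",
    default_headers={
        "X-Knowledge-Id": "knowledge-abc123",
        "X-RAG-Strategy": "hybrid",
    },
)

print(llm.invoke("How do I create a pipeline?").content)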
RAG Strategies
The X-RAG-Strategy header accepts the following values; the sketch after this list shows how to try each strategy on the same request.
- vector: Semantic similarity search. Suitable for natural language questions.
- text: BM25 keyword search. Suitable for exact terms or code search.
- hybrid: Combines vector + text. Recommended for most situations.
- agentic: An LLM agent automatically determines the search strategy. Useful for complex queries.
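A minimal sketch, reusing the SDK setup from the usage examples above; the knowledge ID and question are placeholders.

from openai import OpenAI

client = OpenAI(base_url="https://api.dhub.io/v1", api_key="<access_token>")

# Ask the same question under each strategy to compare the answers.
for strategy in ["vector", "text", "hybrid", "agentic"]:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "How do I create a pipeline?"}],
        extra_headers={
            "X-Knowledge-Id": "knowledge-abc123",
            "X-RAG-Strategy": strategy,
        },
    )
    print(f"[{strategy}] {response.choices[0].message.content}")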