Version: v0.1.0

Tables API

API for querying and uploading table data.

Overview

The Tables API lets you query data stored in D.Hub and upload new data. Data is stored in a version-controlled (Delta Lake) table format, and queries are executed through the analytics engine.

Endpoints

POST /api/v1/tables/schema

Infers the table schema automatically from the file(s) to be uploaded.

Query Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| format | string | parquet | File format (csv, parquet) |

Request Body

Upload the file(s) as multipart/form-data.

| Field | Type | Description |
|-------|------|-------------|
| files | file[] | Files to infer the schema from |

Response

200 OK

{
  "fields": [
    {"name": "id", "type": "int64", "nullable": false},
    {"name": "name", "type": "string", "nullable": true},
    {"name": "price", "type": "double", "nullable": true},
    {"name": "created_at", "type": "timestamp", "nullable": true}
  ]
}
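The inferred schema can also be inspected client-side, for example to list the columns marked non-nullable before uploading data. A minimal sketch using the example response above (the `required_fields` helper is ours, not part of the API):

```python
def required_fields(schema: dict) -> list[str]:
    # Columns the inference marked as non-nullable (i.e. required on upload)
    return [f["name"] for f in schema["fields"] if not f["nullable"]]

# The example response above, as a Python dict:
schema = {
    "fields": [
        {"name": "id", "type": "int64", "nullable": False},
        {"name": "name", "type": "string", "nullable": True},
        {"name": "price", "type": "double", "nullable": True},
        {"name": "created_at", "type": "timestamp", "nullable": True},
    ]
}
print(required_fields(schema))  # ['id']
```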

GET /api/v1/tables/{table_id}/versions

Retrieves the version history of a table (Delta Lake versions).

Path Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| table_id | string | Yes | Table (dataset) ID |

Response

200 OK

[
  {
    "version": 3,
    "timestamp": "2024-01-20T10:30:00Z",
    "operation": "WRITE",
    "operationParameters": {"mode": "Append"},
    "readVersion": 2
  },
  {
    "version": 2,
    "timestamp": "2024-01-19T15:00:00Z",
    "operation": "WRITE",
    "operationParameters": {"mode": "Append"},
    "readVersion": 1
  }
]
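Entries are ordered newest-first, and each entry's `readVersion` records the version the commit was based on. A minimal sketch for picking out the latest version number from such a response (the helper is ours, not part of any client library):

```python
def latest_version(history: list[dict]) -> int:
    # Highest Delta Lake version number in a version-history response
    return max(entry["version"] for entry in history)

# The example response above (fields trimmed for brevity):
history = [
    {"version": 3, "operation": "WRITE", "readVersion": 2},
    {"version": 2, "operation": "WRITE", "readVersion": 1},
]
print(latest_version(history))  # 3
```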

GET /api/v1/tables/{table_id}

Retrieves all data from a table.

Path Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| table_id | string | Yes | Table (dataset) ID |

Query Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| format | string | csv | Response format (csv, json, parquet, arrow) |

Response

Returns data in the specified format.

CSV (default)

id,name,price
1,Product A,10000
2,Product B,20000

JSON

[
  {"id": 1, "name": "Product A", "price": 10000},
  {"id": 2, "name": "Product B", "price": 20000}
]
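The default CSV body can be consumed with only the Python standard library. A minimal sketch using the example response above (the `parse_csv_response` helper is ours; note that CSV parsing yields all values as strings):

```python
import csv
import io

def parse_csv_response(text: str) -> list[dict]:
    # Parse a CSV response body into a list of row dicts (values are strings)
    return list(csv.DictReader(io.StringIO(text)))

body = "id,name,price\n1,Product A,10000\n2,Product B,20000\n"
rows = parse_csv_response(body)
print(rows[0])  # {'id': '1', 'name': 'Product A', 'price': '10000'}
```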

POST /api/v1/tables/{table_id}/query

Executes an SQL query to retrieve data.

Path Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| table_id | string | Yes | Table (dataset) ID |

Query Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| format | string | json | Response format (csv, json, parquet, arrow) |

Request Body

{
  "query": "SELECT category, SUM(price) as total FROM sales GROUP BY category",
  "limit": 1000
}

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| query | string | Yes | SQL query (D.Hub SQL syntax) |
| limit | integer | No | Maximum number of result rows |

Response

Returns query results in the specified format.


PUT /api/v1/tables/{table_id}/upload

Uploads data to a table.

Path Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| table_id | string | Yes | Table (dataset) ID |

Query Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| format | string | parquet | File format (csv, json, parquet) |
| mode | string | append | Write mode (append, overwrite) |

Request Body

Upload the file(s) as multipart/form-data.

| Field | Type | Description |
|-------|------|-------------|
| files | file[] | Data files to upload |

Response

200 OK

{
  "message": "Data inserted successfully"
}

Write Modes

| Mode | Description |
|------|-------------|
| append | Append to the existing data |
| overwrite | Replace the existing data |
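As a sketch, the upload call's query parameters can be validated client-side before building the request; the allowed values mirror the tables above, but the `upload_params` helper itself is ours:

```python
def upload_params(file_format: str = "parquet", mode: str = "append") -> dict:
    # Validate the documented query parameters before building the request
    if file_format not in ("csv", "json", "parquet"):
        raise ValueError(f"unsupported format: {file_format}")
    if mode not in ("append", "overwrite"):
        raise ValueError(f"unsupported mode: {mode}")
    return {"format": file_format, "mode": mode}

print(upload_params("csv", "overwrite"))  # {'format': 'csv', 'mode': 'overwrite'}
```

Note that an overwrite upload replaces the table's current contents, but the earlier state still appears in the version history returned by the versions endpoint.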

POST /api/v1/tables/{table_id}/sink

Starts a data sink (streaming ingestion).

Path Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| table_id | string | Yes | Table (dataset) ID |

Response

200 OK

{
  "message": "Sink created successfully"
}

DELETE /api/v1/tables/{table_id}/sink

Stops and deletes a data sink.

Path Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| table_id | string | Yes | Table (dataset) ID |

Response

200 OK

{
  "message": "Sink deleted successfully"
}

GET /api/v1/tables/{table_id}/sink

Retrieves the status of a data sink.

Path Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| table_id | string | Yes | Table (dataset) ID |

Response

200 OK

{
  "states": {
    "replica-0": "Running",
    "replica-1": "Running"
  }
}
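The three sink endpoints share one path and differ only in HTTP verb: POST starts the sink, GET reports its status, DELETE stops it. A minimal health check over the status response above (the `sink_is_healthy` helper is ours, not part of the API):

```python
def sink_is_healthy(status: dict) -> bool:
    # True when the sink has at least one replica and all report "Running"
    states = status.get("states", {})
    return bool(states) and all(s == "Running" for s in states.values())

# The example response above:
status = {"states": {"replica-0": "Running", "replica-1": "Running"}}
print(sink_is_healthy(status))  # True
```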

Usage Examples

cURL

# Schema inference (quote the URL so the shell does not interpret "?")
curl -X POST "https://api.dhub.io/api/v1/tables/schema?format=csv" \
  -H "Authorization: Bearer <access_token>" \
  -F "files=@data.csv"

# Query table (JSON)
curl "https://api.dhub.io/api/v1/tables/dataset-abc123?format=json" \
  -H "Authorization: Bearer <access_token>"

# Execute SQL query
curl -X POST https://api.dhub.io/api/v1/tables/dataset-abc123/query \
  -H "Authorization: Bearer <access_token>" \
  -H "Content-Type: application/json" \
  -d '{"query": "SELECT * FROM sales WHERE amount > 1000", "limit": 100}'

# Upload data (CSV, append)
curl -X PUT "https://api.dhub.io/api/v1/tables/dataset-abc123/upload?format=csv&mode=append" \
  -H "Authorization: Bearer <access_token>" \
  -F "files=@new_data.csv"

Python

import requests
import pandas as pd

BASE_URL = "https://api.dhub.io/api/v1"
headers = {"Authorization": f"Bearer {access_token}"}  # access_token: your issued bearer token

# Query data
response = requests.get(
    f"{BASE_URL}/tables/dataset-abc123",
    headers=headers,
    params={"format": "json"}
)
df = pd.DataFrame(response.json())

# Execute SQL query
response = requests.post(
    f"{BASE_URL}/tables/dataset-abc123/query",
    headers=headers,
    json={
        "query": "SELECT category, SUM(amount) as total FROM sales GROUP BY category",
        "limit": 1000
    }
)
result = pd.DataFrame(response.json())

# Upload data
with open("data.parquet", "rb") as f:
    response = requests.put(
        f"{BASE_URL}/tables/dataset-abc123/upload",
        headers=headers,
        params={"format": "parquet", "mode": "append"},
        files={"files": f}
    )

Supported File Formats

| Format | Extension | Read | Upload | Description |
|--------|-----------|------|--------|-------------|
| CSV | .csv | Yes | Yes | Text-based, widely supported |
| JSON | .json | Yes | Yes | Supports nested structures |
| Parquet | .parquet | Yes | Yes | Column-oriented, efficient compression |
| Arrow | .arrow | Yes | No | Memory-efficient transfer |