Tables API

An API for reading and uploading data tables.

Overview

The Tables API lets you read data stored in D.Hub and upload new data. Data is stored in Delta Lake format and can be queried through ClickHouse for fast analytical queries.

Endpoints

POST /api/v1/tables/schema

Automatically infers a schema from the files to be uploaded.

Query Parameters

Parameter	Type	Default	Description
`format`	string	parquet	File format (csv, parquet)

Request Body

Upload the files as multipart/form-data.

Field	Type	Description
`files`	file[]	Files to infer the schema from

Response

200 OK

{
  "fields": [
    {"name": "id", "type": "int64", "nullable": false},
    {"name": "name", "type": "string", "nullable": true},
    {"name": "price", "type": "double", "nullable": true},
    {"name": "created_at", "type": "timestamp", "nullable": true}
  ]
}
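
The schema response can be turned into simple lookups before an upload. The following is a minimal sketch that works on the JSON shape shown above; the helper names `required_columns` and `column_types` are hypothetical and not part of the API.

```python
from typing import Any, Dict, List


def required_columns(schema: Dict[str, Any]) -> List[str]:
    """Return the names of fields the schema marks as non-nullable."""
    return [
        field["name"]
        for field in schema.get("fields", [])
        # Treat a missing "nullable" key as nullable (an assumption).
        if not field.get("nullable", True)
    ]


def column_types(schema: Dict[str, Any]) -> Dict[str, str]:
    """Map each field name to its inferred type string."""
    return {field["name"]: field["type"] for field in schema.get("fields", [])}


# Sample payload copied from the response example above.
schema = {
    "fields": [
        {"name": "id", "type": "int64", "nullable": False},
        {"name": "name", "type": "string", "nullable": True},
        {"name": "price", "type": "double", "nullable": True},
        {"name": "created_at", "type": "timestamp", "nullable": True},
    ]
}

print(required_columns(schema))
print(column_types(schema))
```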

GET /api/v1/tables/{table_id}/versions

Returns the table's version history (Delta Lake versions).

Path Parameters

Parameter	Type	Required	Description
`table_id`	string	Yes	Table (dataset) ID

Response

200 OK

[
  {
    "version": 3,
    "timestamp": "2024-01-20T10:30:00Z",
    "operation": "WRITE",
    "operationParameters": {"mode": "Append"},
    "readVersion": 2
  },
  {
    "version": 2,
    "timestamp": "2024-01-19T15:00:00Z",
    "operation": "WRITE",
    "operationParameters": {"mode": "Append"},
    "readVersion": 1
  }
]
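
A client will often want the most recent entry from the version history. This sketch assumes the list shape shown above and does not rely on the server's ordering; `latest_version` and `parse_timestamp` are hypothetical helper names.

```python
from datetime import datetime
from typing import Any, Dict, List


def latest_version(history: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Return the entry with the highest version number."""
    return max(history, key=lambda entry: entry["version"])


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp such as 2024-01-20T10:30:00Z."""
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11,
    # so rewrite it as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


# Sample payload copied from the response example above.
history = [
    {"version": 3, "timestamp": "2024-01-20T10:30:00Z", "operation": "WRITE",
     "operationParameters": {"mode": "Append"}, "readVersion": 2},
    {"version": 2, "timestamp": "2024-01-19T15:00:00Z", "operation": "WRITE",
     "operationParameters": {"mode": "Append"}, "readVersion": 1},
]

newest = latest_version(history)
print(newest["version"], parse_timestamp(newest["timestamp"]).isoformat())
```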

GET /api/v1/tables/{table_id}

Returns all rows of the table.

Path Parameters

Parameter	Type	Required	Description
`table_id`	string	Yes	Table (dataset) ID

Query Parameters

Parameter	Type	Default	Description
`format`	string	csv	Response format (csv, json, parquet, arrow)

Response

Returns the data in the requested format.

CSV (default)

id,name,price
1,Product A,10000
2,Product B,20000

JSON

[
  {"id": 1, "name": "Product A", "price": 10000},
  {"id": 2, "name": "Product B", "price": 20000}
]
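
Because CSV is the default response format, a client that omits `format` has to parse text rather than JSON. The sketch below uses only the standard library and the sample CSV body shown above; note that every value comes back as a string, so numeric columns need explicit conversion. The helper name `parse_csv_rows` is hypothetical.

```python
import csv
import io
from typing import Dict, List


def parse_csv_rows(text: str) -> List[Dict[str, str]]:
    """Parse a CSV response body into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


# Sample body copied from the CSV response example above.
csv_body = "id,name,price\n1,Product A,10000\n2,Product B,20000\n"

rows = parse_csv_rows(csv_body)
# CSV carries no type information, so convert numeric columns explicitly.
total_price = sum(int(row["price"]) for row in rows)
print(rows)
print(total_price)
```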

POST /api/v1/tables/{table_id}/query

Runs a SQL query and returns the matching data.

Path Parameters

Parameter	Type	Required	Description
`table_id`	string	Yes	Table (dataset) ID

Query Parameters

Parameter	Type	Default	Description
`format`	string	json	Response format (csv, json, parquet, arrow)

Request Body

{
  "query": "SELECT category, SUM(price) as total FROM sales GROUP BY category",
  "limit": 1000
}

Field	Type	Required	Description
`query`	string	Yes	SQL query (ClickHouse syntax)
`limit`	integer	No	Maximum number of rows to return

Response

Returns the query result in the requested format.
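
The request described above can be assembled without sending it, which is handy for inspecting the URL and body first. This is a sketch using the standard library; the base URL is taken from the usage examples in this document, and `build_query_request` is a hypothetical helper, not part of any D.Hub SDK.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

BASE_URL = "https://api.dhub.io/api/v1"


def build_query_request(
    table_id: str,
    sql: str,
    token: str,
    limit: Optional[int] = None,
    fmt: str = "json",
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the query endpoint."""
    body = {"query": sql}
    if limit is not None:
        # "limit" is optional, so only include it when the caller sets it.
        body["limit"] = limit
    query_string = urllib.parse.urlencode({"format": fmt})
    url = f"{BASE_URL}/tables/{urllib.parse.quote(table_id, safe='')}/query?{query_string}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


request = build_query_request(
    "dataset-abc123",
    "SELECT category, SUM(price) as total FROM sales GROUP BY category",
    token="<access_token>",
    limit=1000,
)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out here so the example runs without network access.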

PUT /api/v1/tables/{table_id}/upload

Uploads data to the table.

Path Parameters

Parameter	Type	Required	Description
`table_id`	string	Yes	Table (dataset) ID

Query Parameters

Parameter	Type	Default	Description
`format`	string	parquet	File format (csv, json, parquet)
`mode`	string	append	Write mode (append, overwrite)

Request Body

Upload the files as multipart/form-data.

Field	Type	Description
`files`	file[]	Data files to upload

Response

200 OK

{
  "message": "Data inserted successfully"
}

Write Modes

Mode	Description
`append`	Adds the data to the existing data
`overwrite`	Replaces the existing data
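
Since `overwrite` replaces existing data, it can be worth validating `format` and `mode` on the client before sending anything. The sketch below builds the upload URL from the documented values and defaults; `upload_url` is a hypothetical helper and the client-side checks are an assumption about what is useful, not a description of server behavior.

```python
from urllib.parse import urlencode

# Values documented for the upload endpoint's query parameters.
UPLOAD_FORMATS = {"csv", "json", "parquet"}
WRITE_MODES = {"append", "overwrite"}


def upload_url(base_url: str, table_id: str, fmt: str = "parquet", mode: str = "append") -> str:
    """Build the upload URL, rejecting undocumented format or mode values."""
    if fmt not in UPLOAD_FORMATS:
        raise ValueError(f"unsupported upload format: {fmt!r}")
    if mode not in WRITE_MODES:
        raise ValueError(f"unsupported write mode: {mode!r}")
    return f"{base_url}/tables/{table_id}/upload?{urlencode({'format': fmt, 'mode': mode})}"


print(upload_url("https://api.dhub.io/api/v1", "dataset-abc123", fmt="csv", mode="overwrite"))
```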

POST /api/v1/tables/{table_id}/sink

Starts a data sink (streaming ingestion).

Path Parameters

Parameter	Type	Required	Description
`table_id`	string	Yes	Table (dataset) ID

Response

200 OK

{
  "message": "Sink created successfully"
}

DELETE /api/v1/tables/{table_id}/sink

Stops and deletes the data sink.

Path Parameters

Parameter	Type	Required	Description
`table_id`	string	Yes	Table (dataset) ID

Response

200 OK

{
  "message": "Sink deleted successfully"
}

GET /api/v1/tables/{table_id}/sink

Returns the status of the data sink.

Path Parameters

Parameter	Type	Required	Description
`table_id`	string	Yes	Table (dataset) ID

Response

200 OK

{
  "states": {
    "replica-0": "Running",
    "replica-1": "Running"
  }
}
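
The per-replica states can be reduced to a single health check. This sketch assumes the response shape shown above. Only the `Running` state appears in this document, so any other value is treated as unhealthy, and the `Failed` value in the second sample is purely illustrative. The helper names are hypothetical.

```python
from typing import Any, Dict, List


def unhealthy_replicas(status: Dict[str, Any]) -> List[str]:
    """Return the sorted names of replicas whose state is not "Running"."""
    states = status.get("states", {})
    return sorted(name for name, state in states.items() if state != "Running")


def sink_is_healthy(status: Dict[str, Any]) -> bool:
    """True when at least one replica exists and all replicas are running."""
    states = status.get("states", {})
    return bool(states) and not unhealthy_replicas(status)


# Sample payload copied from the response example above.
status = {"states": {"replica-0": "Running", "replica-1": "Running"}}
# Illustrative payload; "Failed" is a made-up state name.
degraded = {"states": {"replica-0": "Running", "replica-1": "Failed"}}

print(sink_is_healthy(status))
print(unhealthy_replicas(degraded))
```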

Usage Examples

cURL

# Infer a schema
curl -X POST "https://api.dhub.io/api/v1/tables/schema?format=csv" \
  -H "Authorization: Bearer <access_token>" \
  -F "files=@data.csv"

# Read a table (JSON)
curl "https://api.dhub.io/api/v1/tables/dataset-abc123?format=json" \
  -H "Authorization: Bearer <access_token>"

# Run a SQL query
curl -X POST https://api.dhub.io/api/v1/tables/dataset-abc123/query \
  -H "Authorization: Bearer <access_token>" \
  -H "Content-Type: application/json" \
  -d '{"query": "SELECT * FROM sales WHERE amount > 1000", "limit": 100}'

# Upload data (CSV, append)
curl -X PUT "https://api.dhub.io/api/v1/tables/dataset-abc123/upload?format=csv&mode=append" \
  -H "Authorization: Bearer <access_token>" \
  -F "files=@new_data.csv"

Python

import requests
import pandas as pd

BASE_URL = "https://api.dhub.io/api/v1"
access_token = "<access_token>"  # replace with a valid access token
headers = {"Authorization": f"Bearer {access_token}"}

# Read the table as JSON
response = requests.get(
    f"{BASE_URL}/tables/dataset-abc123",
    headers=headers,
    params={"format": "json"}
)
response.raise_for_status()  # fail early on an HTTP error
df = pd.DataFrame(response.json())

# Run a SQL query (the response format defaults to json)
response = requests.post(
    f"{BASE_URL}/tables/dataset-abc123/query",
    headers=headers,
    json={
        "query": "SELECT category, SUM(amount) as total FROM sales GROUP BY category",
        "limit": 1000
    }
)
response.raise_for_status()
result = pd.DataFrame(response.json())

# Upload data (Parquet, append)
with open("data.parquet", "rb") as f:
    response = requests.put(
        f"{BASE_URL}/tables/dataset-abc123/upload",
        headers=headers,
        params={"format": "parquet", "mode": "append"},
        files={"files": f}
    )
response.raise_for_status()

Supported File Formats

Format	Extension	Read	Upload	Description
CSV	.csv	Yes	Yes	Text-based, widely compatible
JSON	.json	Yes	Yes	Supports nested structures
Parquet	.parquet	Yes	Yes	Columnar, compresses efficiently
Arrow	.arrow	Yes	No	Memory-efficient transfer
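
The table above can be encoded as a lookup so a client picks the `format` parameter from a file's extension and rejects read-only formats before uploading. This is a sketch; the mapping mirrors the table, while `upload_format_for` is a hypothetical helper name.

```python
from pathlib import Path

# extension -> (format name, upload supported), mirroring the table above
FORMATS = {
    ".csv": ("csv", True),
    ".json": ("json", True),
    ".parquet": ("parquet", True),
    ".arrow": ("arrow", False),
}


def upload_format_for(path: str) -> str:
    """Return the upload `format` value for a file, based on its extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in FORMATS:
        raise ValueError(f"unknown file extension: {suffix!r}")
    name, can_upload = FORMATS[suffix]
    if not can_upload:
        raise ValueError(f"{name} files can be read but not uploaded")
    return name


print(upload_format_for("new_data.csv"))
print(upload_format_for("exports/data.PARQUET"))
```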
