Pipelines API
A Pipeline is a workflow that defines a data processing flow. It consists of nodes (Dataset and Code steps) and the edges that connect them.
1. Create Pipeline
Creates a new pipeline.
Request
POST /pipelines/
Body Schema (Pipeline)
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Pipeline name |
| steps | array[PipelineStep] | Yes | List of pipeline steps (nodes) |
| options | object | No | Execution options (scheduling, etc.) |
PipelineStep Object
| Field | Type | Description |
|---|---|---|
| name | string | Step name (node ID) |
| script | string | Script or command to execute |
| inputs | map[string, PipelineData] | Input data connections |
| outputs | map[string, PipelineData] | Output data connections |
Example
```json
{
  "name": "daily_etl_pipeline",
  "steps": [
    {
      "name": "read_source",
      "script": "read_csv",
      "outputs": {
        "out": { "dataset": "source_dataset_id" }
      }
    },
    {
      "name": "transform",
      "script": "python_script_id",
      "inputs": {
        "in": { "dataset": "source_dataset_id" }
      },
      "outputs": {
        "result": { "dataset": "target_dataset_id" }
      }
    }
  ]
}
```
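A minimal sketch of sending this payload with the Python standard library. The base URL is a hypothetical placeholder and authentication headers are omitted; only the path, method, and body shape come from this document:

```python
import json
import urllib.request

# Hypothetical base URL -- substitute your deployment's endpoint.
BASE_URL = "https://api.example.com"

def create_pipeline(body: dict) -> urllib.request.Request:
    """Build a POST /pipelines/ request carrying the pipeline body as JSON."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/pipelines/",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

pipeline = {
    "name": "daily_etl_pipeline",
    "steps": [
        {
            "name": "read_source",
            "script": "read_csv",
            "outputs": {"out": {"dataset": "source_dataset_id"}},
        },
        {
            "name": "transform",
            "script": "python_script_id",
            "inputs": {"in": {"dataset": "source_dataset_id"}},
            "outputs": {"result": {"dataset": "target_dataset_id"}},
        },
    ],
}

# Construct the request; send it against a live server with
# urllib.request.urlopen(req).
req = create_pipeline(pipeline)
```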
2. Update Pipeline
Updates pipeline configuration.
Request
PUT /pipelines/{pipeline_id}
Body Schema (PipelineUpdate)
| Field | Type | Description |
|---|---|---|
| steps | array[PipelineStep] | Redefines the entire set of steps (nodes) |
| options | object | Changes to execution options |
| metadata | object | Metadata such as UI layout information |
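A sketch of a PipelineUpdate body. The cron-style `schedule` option and the `node_positions` metadata key are illustrative assumptions, not fields defined by this document:

```json
{
  "steps": [
    {
      "name": "transform",
      "script": "python_script_id",
      "inputs": { "in": { "dataset": "source_dataset_id" } },
      "outputs": { "result": { "dataset": "target_dataset_id" } }
    }
  ],
  "options": { "schedule": "0 3 * * *" },
  "metadata": { "node_positions": { "transform": { "x": 120, "y": 80 } } }
}
```

Note that `steps` replaces the whole step list; to keep an existing node, include it again in the update body.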
3. Delete Pipeline
Deletes a pipeline. The request may fail if the pipeline has running batches.
Request
DELETE /pipelines/{pipeline_id}
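Because the delete can be refused while batches are running, a client should handle that error explicitly. A stdlib sketch, assuming a hypothetical base URL and assuming the server signals the refusal with HTTP 409 (the exact status code is not specified by this document):

```python
import urllib.error
import urllib.request

# Hypothetical base URL -- substitute your deployment's endpoint.
BASE_URL = "https://api.example.com"

def build_delete_request(pipeline_id: str) -> urllib.request.Request:
    """Construct a DELETE /pipelines/{pipeline_id} request."""
    return urllib.request.Request(
        f"{BASE_URL}/pipelines/{pipeline_id}", method="DELETE"
    )

def delete_pipeline(pipeline_id: str) -> bool:
    """Return True on success, False if the server refuses the delete.

    A 409 Conflict is assumed here for "running batches still exist";
    any other HTTP error is re-raised to the caller.
    """
    try:
        with urllib.request.urlopen(build_delete_request(pipeline_id)):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 409:
            return False
        raise
```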