Pipelines API

A Pipeline is a workflow that defines how data moves through a series of processing steps. It consists of nodes (Dataset, Code) and the edges that connect them.

1. Create Pipeline

Creates a new pipeline.

Request

POST /pipelines/

Body Schema (Pipeline)

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Pipeline name |
| steps | array[PipelineStep] | Yes | List of pipeline configuration steps (nodes) |
| options | object | No | Execution options (scheduling, etc.) |

PipelineStep Object

| Field | Type | Description |
| --- | --- | --- |
| name | string | Step name (node ID) |
| script | string | Script or command to execute |
| inputs | map[string, PipelineData] | Input data connection information |
| outputs | map[string, PipelineData] | Output data connection information |
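The relationship between steps and edges can be illustrated with a short Python sketch. It assumes that each PipelineData entry carries a `dataset` key (as in the example payload in this section) and that an edge exists wherever one step's output dataset matches another step's input dataset. The source does not confirm that rule, and `derive_edges` is a hypothetical helper, not part of the API.

```python
from typing import Dict, List, Tuple


def derive_edges(steps: List[dict]) -> List[Tuple[str, str, str]]:
    """Return (producer_step, consumer_step, dataset_id) for each shared dataset.

    Assumption: steps are connected when an output dataset ID of one step
    equals an input dataset ID of another step.
    """
    # Map each dataset ID to the names of the steps that write it.
    producers: Dict[str, List[str]] = {}
    for step in steps:
        for data in step.get("outputs", {}).values():
            producers.setdefault(data["dataset"], []).append(step["name"])

    # For every input, look up which steps produce that dataset.
    edges = []
    for step in steps:
        for data in step.get("inputs", {}).values():
            for producer in producers.get(data["dataset"], []):
                edges.append((producer, step["name"], data["dataset"]))
    return edges


steps = [
    {"name": "read_source", "script": "read_csv",
     "outputs": {"out": {"dataset": "source_dataset_id"}}},
    {"name": "transform", "script": "python_script_id",
     "inputs": {"in": {"dataset": "source_dataset_id"}},
     "outputs": {"result": {"dataset": "target_dataset_id"}}},
]

print(derive_edges(steps))
# prints [('read_source', 'transform', 'source_dataset_id')]
```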

Example

{
  "name": "daily_etl_pipeline",
  "steps": [
    {
      "name": "read_source",
      "script": "read_csv",
      "outputs": {
        "out": { "dataset": "source_dataset_id" }
      }
    },
    {
      "name": "transform",
      "script": "python_script_id",
      "inputs": {
        "in": { "dataset": "source_dataset_id" }
      },
      "outputs": {
        "result": { "dataset": "target_dataset_id" }
      }
    }
  ]
}
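
As a rough illustration, the request above could be assembled in Python with the standard library. This is a minimal sketch: `BASE_URL` is a placeholder (the source does not give a host), `build_create_request` is a hypothetical helper, and authentication is omitted because the source does not describe it.

```python
import json
import urllib.request

# Assumption: placeholder host; replace with the real API base URL.
BASE_URL = "https://api.example.com"


def build_create_request(name, steps, options=None):
    """Build (but do not send) a POST /pipelines/ request."""
    body = {"name": name, "steps": steps}
    if options is not None:
        # "options" is not required by the schema, so send it only when given.
        body["options"] = options
    return urllib.request.Request(
        url=f"{BASE_URL}/pipelines/",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_create_request(
    "daily_etl_pipeline",
    steps=[
        {"name": "read_source", "script": "read_csv",
         "outputs": {"out": {"dataset": "source_dataset_id"}}},
        {"name": "transform", "script": "python_script_id",
         "inputs": {"in": {"dataset": "source_dataset_id"}},
         "outputs": {"result": {"dataset": "target_dataset_id"}}},
    ],
)

# To send the request: urllib.request.urlopen(request)
```

Building the request separately from sending it makes the payload easy to inspect before anything reaches the server.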

2. Update Pipeline

Updates the configuration of an existing pipeline.

Request

PUT /pipelines/{pipeline_id}

Body Schema (PipelineUpdate)

| Field | Type | Description |
| --- | --- | --- |
| steps | array[PipelineStep] | Replaces the entire list of steps (nodes) |
| options | object | Updated execution options |
| metadata | object | Metadata such as UI layout information |
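
An update request could be built along the same lines. In this sketch, `BASE_URL` and `build_update_request` are hypothetical, and the `metadata` shape in the usage line is invented for illustration. The source does not say whether fields omitted from the body are left unchanged or reset, so check that behavior before relying on partial bodies.

```python
import json
import urllib.parse
import urllib.request

# Assumption: placeholder host; replace with the real API base URL.
BASE_URL = "https://api.example.com"


def build_update_request(pipeline_id, steps=None, options=None, metadata=None):
    """Build (but do not send) a PUT /pipelines/{pipeline_id} request."""
    fields = {"steps": steps, "options": options, "metadata": metadata}
    # Include only the fields the caller actually supplied.
    body = {key: value for key, value in fields.items() if value is not None}
    # Percent-encode the ID so unusual characters cannot alter the path.
    safe_id = urllib.parse.quote(str(pipeline_id), safe="")
    return urllib.request.Request(
        url=f"{BASE_URL}/pipelines/{safe_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical metadata shape: the source only says "UI layout information".
update_request = build_update_request(
    "pipeline_123",
    metadata={"layout": {"transform": {"x": 120, "y": 40}}},
)
```

Because `steps` redefines the whole list, a caller that wants to change a single step would need to send every step, not just the modified one.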

3. Delete Pipeline

Deletes a pipeline. The request may fail if the pipeline has running batches.

Request

DELETE /pipelines/{pipeline_id}
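
Since deletion can fail while batches are running, a client should be prepared for an error response. The sketch below is one way to do that in Python. `BASE_URL`, `build_delete_request`, and `delete_pipeline` are hypothetical names, and because the source does not specify which status code a blocked delete returns, the code reports whatever code the server sends instead of assuming one.

```python
import urllib.error
import urllib.parse
import urllib.request

# Assumption: placeholder host; replace with the real API base URL.
BASE_URL = "https://api.example.com"


def build_delete_request(pipeline_id):
    """Build (but do not send) a DELETE /pipelines/{pipeline_id} request."""
    safe_id = urllib.parse.quote(str(pipeline_id), safe="")
    return urllib.request.Request(
        url=f"{BASE_URL}/pipelines/{safe_id}",
        method="DELETE",
    )


def delete_pipeline(pipeline_id):
    """Send the delete request.

    Returns (True, status) on success and (False, status) when the server
    answers with an HTTP error, for example because batches are running.
    """
    request = build_delete_request(pipeline_id)
    try:
        with urllib.request.urlopen(request) as response:
            return True, response.status
    except urllib.error.HTTPError as error:
        return False, error.code


delete_request = build_delete_request("pipeline_123")
```

A caller might retry `delete_pipeline` after the running batches finish, depending on how the server reports that condition.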