Pipelines (Collection)
Within a collection, you can check the configuration and execution status of pipelines, and navigate to the pipeline editor to make modifications when needed.
Pipeline Overview
A pipeline is a resource that defines a data processing workflow. It consists of multiple Steps, where each step references a code artifact and specifies input/output datasets.
Including a pipeline in a collection allows you to group it together with related datasets and code as a single unit of work.
Detailed editing and node configuration of pipelines are performed in the dedicated Pipeline Editor. The collection view focuses on reviewing pipeline summary information, data, and versions.
Pipeline View in Collections
When you select a pipeline item in the collection tree, the following tabs are displayed in the right panel.
Overview Tab
View the pipeline's basic information and step configuration.
| Field | Description |
|---|---|
| Name | Unique identifier for the pipeline |
| Alias | Display name shown to users |
| Type | Pipeline type |
| Tags | Tag list for search and classification |
| Steps | Number of steps that make up the pipeline |
| Comment | Description of the pipeline |
Each pipeline consists of one or more steps. You can view summary information for each step.
- Step Name: Identifier for each execution stage
- Script: Code artifact to be executed in the step
- Inputs/Outputs: Input and output datasets
- Dependencies: Dependencies on other steps
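The step summary above can be sketched as plain data. The following is a hypothetical illustration only; the field names mirror the summary fields in this section, not the product's actual schema, and the pipeline and artifact names are invented:

```python
# Hypothetical pipeline with two steps. Field names mirror the step
# summary (Step Name, Script, Inputs/Outputs, Dependencies); they are
# illustrative, not the product's actual schema.
pipeline = {
    "name": "daily-sales-etl",
    "steps": [
        {
            "name": "extract",
            "script": "extract_sales.py",   # code artifact from the collection
            "inputs": ["raw_sales"],        # input dataset(s)
            "outputs": ["staged_sales"],    # output dataset(s)
            "dependencies": [],             # no upstream steps; runs first
        },
        {
            "name": "transform",
            "script": "clean_sales.py",
            "inputs": ["staged_sales"],
            "outputs": ["clean_sales"],
            "dependencies": ["extract"],    # waits for the extract step
        },
    ],
}
```

Note how the second step's input dataset is the first step's output, and its dependency list names the first step, which is how execution order is expressed.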
Data Tab
View the pipeline's input/output data. You can directly query the contents of datasets connected to the pipeline.
Versions Tab
View and manage the pipeline's version history.
- Version List: Displays all pipeline versions in chronological order.
- Version Restore: Restore to a previous version.
Relationship Between Pipelines and Collection Items
Each step of a pipeline can reference other items within the same collection.
- Code Reference: Specify a code artifact name from the collection in the step's `script` field.
- Dataset Input: Map a dataset from the collection as input data in the step's `inputs`.
- Dataset Output: Specify a dataset to store processing results in the step's `outputs`.
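The rule these references imply is that every name a step uses must resolve to an item in the same collection. A minimal sketch of that consistency check, with hypothetical function, item, and step names (the product's actual validation is not documented here):

```python
def unresolved_references(collection_items, steps):
    """Return (step name, reference) pairs that do not resolve to a
    collection item. Purely illustrative; not a product API."""
    missing = []
    for step in steps:
        refs = [step["script"], *step["inputs"], *step["outputs"]]
        for ref in refs:
            if ref not in collection_items:
                missing.append((step["name"], ref))
    return missing

# Hypothetical collection contents: one code artifact, two datasets.
collection_items = {"extract_sales.py", "raw_sales", "staged_sales"}
steps = [
    {
        "name": "extract",
        "script": "extract_sales.py",
        "inputs": ["raw_sales"],
        "outputs": ["staged_sales", "audit_log"],  # audit_log is missing
    },
]
print(unresolved_references(collection_items, steps))
# [('extract', 'audit_log')]
```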
Grouping pipelines and related resources together at the collection level lets you see all the components needed for a data processing task at a glance.
Navigate to Pipeline Editor
To visually edit the pipeline structure from the collection view, click the Open Pipeline Editor button. The pipeline editor allows the following tasks:
- Adding and removing nodes (steps)
- Setting dependencies between steps
- Mapping input/output datasets
- Connecting code artifacts
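Setting dependencies between steps implies an execution order: a step can only run after all steps it depends on have finished. That ordering can be sketched with a topological sort over a hypothetical dependency map (the step names are invented for illustration):

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: step name -> set of steps it depends on.
deps = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"transform"},
}

# static_order() yields steps so that every step appears after
# all of its dependencies.
order = list(TopologicalSorter(deps).static_order())
print(order)  # 'extract' first; 'transform' before 'load' and 'report'
```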
Duplicating Pipelines
To duplicate a pipeline, use its Duplicate feature. You can target either the same collection or a different one when creating the copy.
For detailed usage of the pipeline editor, see the Workflow Editor documentation.
Next Steps
- Workflow Editor — How to use the pipeline visual editor
- Adding Nodes — Adding nodes to pipelines
- Running Pipelines — Pipeline execution and monitoring
- Version Control — Resource version management system