Version: v0.1.0

Debugging

This guide explains tools and methods for analyzing and resolving issues that occur during pipeline execution. D.Hub records execution history at the pipeline, batch, and step levels to support systematic debugging.

Execution History (Traces)

Detailed execution records (Traces) are automatically saved each time a pipeline runs.

Execution History Structure

The execution history is organized into three levels:

| Level | Description |
| --- | --- |
| Pipeline | The overall execution unit for the pipeline |
| Batch | An individual execution instance of the pipeline (a new batch is created for each run) |
| Step | The execution unit for each node within a batch |

Viewing Execution History

  1. Click a specific execution run in the Run History Bar at the top of the workflow editor, or
  2. Select a pipeline from the pipeline list and open its execution history tab.

Information available for each execution run:

| Item | Description |
| --- | --- |
| Batch ID | Unique identifier for the execution instance |
| Start/End Time | Execution start and completion time |
| Duration | Total time taken for the execution |
| Status | Execution result (SUCCESS, FAILED, etc.) |
| Trigger | Execution cause (manual run, schedule, event) |

Batch Status

Each batch has one of the following statuses:

| Status | Description | Follow-up Action |
| --- | --- | --- |
| RUNNING | Currently executing | Wait for completion, or stop it with the Stop button if needed |
| SUCCESS | All steps completed successfully | Check the output datasets |
| FAILED | An error occurred in one or more steps | Analyze the error, fix it, and re-run |
| CANCELLED | Execution cancelled by the user | Re-run if needed |

Error Analysis

Checking Step-level Logs

Steps for analyzing a failed batch:

  1. Select the failed batch: Click the batch shown in red (failed) in the Run History.
  2. Identify the failed step: Look for nodes with red borders on the canvas.
  3. Check the Inspector panel: Click the failed node to view error details in the Inspector panel.
  4. Check the History tab: Review the batch execution history and detailed logs in the Code node's History tab.

Interpreting Error Messages

Error information available in the Inspector panel:

| Item | Description |
| --- | --- |
| Error Type | Error type (ConnectionError, ValueError, etc.) |
| Error Message | Message describing the error cause |
| Stack Trace | Detailed call stack for Python code |
| Timestamp | Time when the error occurred |
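These fields map directly onto what Python surfaces when an exception is raised. As a minimal sketch (the step body `run_step` and its error are hypothetical, not part of D.Hub), the same three pieces of information can be captured like this:

```python
import logging
import traceback

logging.basicConfig(level=logging.INFO)

def run_step():
    # Hypothetical step body that fails.
    raise ValueError("unexpected null in column 'price'")

try:
    run_step()
except Exception as exc:
    # These correspond to the Error Type, Error Message, and Stack Trace
    # items shown in the Inspector panel.
    logging.error("Error Type: %s", type(exc).__name__)
    logging.error("Error Message: %s", exc)
    logging.error("Stack Trace:\n%s", traceback.format_exc())
```

Reading the last frame of the stack trace usually points at the exact line in your Code node that raised the error.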

Common Error Types and Solutions

| Error Type | Cause | Solution |
| --- | --- | --- |
| ConnectionError | Failed to connect to data source (DB, API) | Check network status, credentials, endpoint address |
| SchemaError | Input/output data schema mismatch | Verify column names and data type mapping |
| SyntaxError | Python/SQL code syntax error | Fix the code syntax and re-run |
| RuntimeError | Exception during code execution | Analyze the stack trace and fix the logic |
| TimeoutError | Execution time exceeded | Optimize queries, reduce data size, adjust timeout values |
| ResourceError | Insufficient CPU/memory resources | Increase resource limits or process data in smaller batches |
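Transient failures such as ConnectionError often succeed on a retry without any code change. One possible pattern (a sketch, not a D.Hub feature; the `with_retry` helper is illustrative) is to wrap the flaky call in a small retry loop inside the Code node:

```python
import time

def with_retry(fn, retries=3, delay=1.0):
    """Retry a flaky callable a few times before giving up,
    e.g. for transient ConnectionError from a DB or API."""
    for attempt in range(1, retries + 1):
        try:
            return fn()
        except ConnectionError:
            if attempt == retries:
                raise  # out of retries: let the step fail normally
            time.sleep(delay * attempt)  # simple linear backoff
```

If the error persists across all retries, it is likely a real configuration problem (credentials, endpoint, firewall) rather than a transient one.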
tip

If errors keep recurring in a Code node, add logging (print or the logging module) inside the code to inspect intermediate data states; this often makes the cause much easier to identify.
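As a minimal sketch of this tip (the `transform` function and its filtering logic are hypothetical), logging the record count before and after each operation quickly shows where data goes wrong:

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("my_step")

def transform(records):
    log.info("input: %d records", len(records))
    # Example intermediate operation: drop records with a null price.
    cleaned = [r for r in records if r.get("price") is not None]
    log.info("after filtering nulls: %d records", len(cleaned))
    return cleaned
```

The log lines appear in the step's Logs, so a sudden drop in the counts between two messages pinpoints the operation that discarded the data.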

Step-level Detailed Traces

The following information is available in each step's trace:

| Item | Description |
| --- | --- |
| Execution Time | Duration from step start to end |
| Input Record Count | Number of records input to the step |
| Output Record Count | Number of records output from the step |
| Logs | Log messages output during step execution |
| Errors | Detailed information about errors that occurred |
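Comparing the input and output record counts is a quick sanity check even when a step succeeds: a sharp drop often means a filter or join is silently discarding data. A simple illustrative check (the `check_record_drop` helper and its threshold are assumptions, not a D.Hub API):

```python
def check_record_drop(input_count, output_count, threshold=0.5):
    """Flag a step whose output record count drops sharply versus its input.

    Returns True when more than `threshold` (default 50%) of the
    input records disappeared in this step.
    """
    if input_count == 0:
        return False  # nothing came in, so nothing could be dropped
    return (input_count - output_count) / input_count > threshold
```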

Re-running Batches

You can re-run a failed batch after making corrections.

How to Re-run

  1. Analyze the failure cause and fix the code or settings.
  2. Save the changes.
  3. Click the Run button on the top toolbar to create a new batch.
warning

A new batch is created when re-running. The history of the previously failed batch is preserved, so it can be used for comparative analysis.

Debugging Best Practices

  1. Log review order: Narrow the scope by analyzing in the order Pipeline → Batch → Step.
  2. Incremental testing: For complex pipelines, add nodes one at a time and verify each stage.
  3. Test with small data: Test with a small sample dataset before running on the full dataset.
  4. Compare with previous successful batches: Compare input data and settings between the failed batch and the last successful batch.
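Best practice 4 can be partly automated when batch settings are available as dictionaries. A small illustrative helper (the `diff_settings` function and the setting names are hypothetical) that returns only the keys whose values differ between the last successful batch and the failed one:

```python
def diff_settings(successful, failed):
    """Return {key: (successful_value, failed_value)} for every setting
    that differs between two batches' configuration dicts."""
    keys = set(successful) | set(failed)
    return {
        k: (successful.get(k), failed.get(k))
        for k in keys
        if successful.get(k) != failed.get(k)
    }
```

A short diff like `{"timeout": (30, 60)}` immediately narrows the investigation to the changed setting instead of the whole pipeline.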