Dataset Creation

This guide explains how to create a new dataset using the Template Wizard.

Step 1: Select Type

  1. Select the Dataset card.
  2. Choose a subtype.
    • Delta: General-purpose tabular data storage
    • Kafka / DDS: Real-time stream data
    • REST: External API integration

Step 2: Basic Info

[Screenshot] Basic info input screen in the dataset creation wizard

Define the dataset metadata.

  • Name: Dataset name (required)
  • Alias: Display name
  • Description: Dataset description
  • Tags: Search tags
  • Additional Settings (by type):
    • REST: Enter API Endpoint URL (required)
    • Kafka/DDS: Enter Topic name (AI auto-generation available)
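
The required-field rules above can be sketched as a small data model. This is an illustration only: the `DatasetInfo` class, its field names, and the `validate` helper are hypothetical and are not part of the wizard or any documented API. They only mirror the fields listed in this step.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DatasetInfo:
    """Hypothetical container mirroring the Basic Info fields."""
    name: str
    subtype: str  # "Delta", "Kafka", "DDS", or "REST"
    alias: Optional[str] = None
    description: Optional[str] = None
    tags: List[str] = field(default_factory=list)
    endpoint_url: Optional[str] = None  # used by REST datasets
    topic: Optional[str] = None         # used by Kafka / DDS datasets


def validate(info: DatasetInfo) -> List[str]:
    """Return a list of problems; an empty list means nothing required is missing."""
    problems = []
    if not info.name.strip():
        problems.append("Name is required")
    if info.subtype == "REST" and not info.endpoint_url:
        problems.append("REST datasets require an API Endpoint URL")
    return problems


print(validate(DatasetInfo(name="orders", subtype="REST")))
```

Only Name and the REST endpoint are checked here because those are the only fields this guide marks as required.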

Step 3: Schema

Define the dataset structure (columns). Three modes are supported.

UI Mode

[Screenshot] Dataset schema definition screen (in UI mode with columns added)

Add and configure columns in an intuitive table interface.

  • Add Column: Add a new column
  • Name: Column name
  • Type: Data type (Text, Integer, Decimal, Boolean, Timestamp, Date, etc.)
  • Nullable: Whether null values are allowed

JSON Mode

Write or paste the schema directly in JSON format. This mode is useful for defining complex nested structures.
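
As a rough illustration of what a nested schema might look like, the sketch below builds one in Python and prints it as JSON. The key names ("columns", "name", "type", "nullable", "fields") and the "Struct" type are assumptions made for this example. The exact format the wizard accepts is not specified in this guide, so check the JSON editor's default content before pasting.

```python
import json

# Hypothetical schema layout: the key names and the "Struct" type are
# assumptions for illustration, not the wizard's documented format.
schema = {
    "columns": [
        {"name": "order_id", "type": "Integer", "nullable": False},
        {"name": "placed_at", "type": "Timestamp", "nullable": False},
        {
            "name": "customer",
            "type": "Struct",
            "nullable": True,
            # Nested columns, which are awkward to express in a flat table UI
            "fields": [
                {"name": "email", "type": "Text", "nullable": False},
                {"name": "vip", "type": "Boolean", "nullable": True},
            ],
        },
    ]
}

# Serialize to the text you would paste into the editor
schema_json = json.dumps(schema, indent=2)
print(schema_json)
```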

CSV Mode

Upload a sample CSV file to automatically infer the schema.

  • Drag and drop a file, or select one to upload. The system analyzes the headers and data types and generates the schema automatically.
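
The general idea behind schema inference can be sketched as follows. This is not the system's actual algorithm, which this guide does not describe. It is a minimal approximation that assumes the type names from UI Mode, tries each type from narrowest to widest, and marks a column nullable when any sample value is empty. The function names `infer_type` and `infer_schema` are hypothetical.

```python
import csv
import io
from datetime import datetime


def infer_type(values):
    """Pick the narrowest type name that fits every non-empty value."""
    present = [v for v in values if v != ""]
    if not present:
        return "Text"  # nothing to go on, so fall back to the widest type

    def all_match(parse):
        for v in present:
            try:
                parse(v)
            except ValueError:
                return False
        return True

    if all(v.lower() in ("true", "false") for v in present):
        return "Boolean"
    if all_match(int):
        return "Integer"
    if all_match(float):
        return "Decimal"
    # Check date-only values before timestamps, since fromisoformat
    # also accepts plain dates.
    if all_match(lambda v: datetime.strptime(v, "%Y-%m-%d")):
        return "Date"
    if all_match(datetime.fromisoformat):
        return "Timestamp"
    return "Text"


def infer_schema(csv_text):
    """Return one column definition per CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    headers = reader.fieldnames or []
    schema = []
    for header in headers:
        # Short rows yield None for missing cells; treat those as empty
        values = [(row[header] or "") for row in rows]
        schema.append({
            "name": header,
            "type": infer_type(values),
            "nullable": any(v == "" for v in values),
        })
    return schema


sample = (
    "id,price,active,created,note\n"
    "1,9.50,true,2024-01-05,\n"
    "2,12,false,2024-01-06,gift\n"
)
for column in infer_schema(sample):
    print(column)
```

Because inference depends on the sample rows, review the generated types and nullability after upload and adjust any column the sample does not represent well.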

Completion

Once all settings are complete, click the Create (or Submit Dataset) button to create the dataset.