Semantic Layer Configuration Documentation

Overview

The Semantic Layer configuration provides a structured way to define analytics data models, transformations, and metrics. It consists of several YAML configuration files that define events, entity properties, and Key Performance Indicators (KPIs).

Core Components

1. Event Logical Tables

Event tables define the source data structure and basic metadata.

yaml
# Example structure
table_name: [schema].[table_name]
tags: [tag1, tag2]
columns:
  column_name:
    data_type: [DATA_TYPE]
    tags: [optional_column_tags]

Key concepts:

  • Each table can have multiple tags for categorization
  • Columns can have specific data types and tags
  • Special columns can be tagged (e.g., entity_id_column, event_timestamp_column)

Example:

yaml
table_name: demodata.ad_watched
tags: [registration_event, activity_event]
columns:
  user_id:
    data_type: STRING
    tags: [entity_id_column]
  _time:
    data_type: DATETIME
    tags: [event_timestamp_column]

Available Table Tags

  • registration_event: Indicates that the table is used to build the internal first-appearance data structure
  • activity_event: Marks tables that track user activity events (as opposed to system-driven events such as notifications)

Example:

yaml
table_name: demodata.session_start
tags: [registration_event, activity_event]

Column Tags

Tags that define special roles or characteristics of columns.

Special Role Tags
  • entity_id_column: Marks the column that contains the unique identifier for entities (typically user ID)

    yaml
    user_id:
      data_type: STRING
      tags: [entity_id_column]
  • event_timestamp_column: Identifies the column containing the event timestamp

    yaml
    _time:
      data_type: DATETIME
      tags: [event_timestamp_column]
  • date_column: Marks columns representing dates for date-based calculations

    yaml
    date:
      data_type: DATE
      tags: [date_column]
  • registration_property: Indicates columns that contain properties set at user registration

    yaml
    _platform:
      data_type: STRING
      tags: [registration_property]
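
Combining the role tags above, a complete event table might look like the following sketch. It reuses the demodata.session_start table and the column names from the earlier examples. The full column list, the untagged country column, and the assumption that all four role tags may appear on one table are illustrative and not confirmed by the schema.

```yaml
# Illustrative sketch: the column list is assumed, not taken from a real table
table_name: demodata.session_start
tags: [registration_event, activity_event]
columns:
  user_id:
    data_type: STRING
    tags: [entity_id_column]
  _time:
    data_type: DATETIME
    tags: [event_timestamp_column]
  date:
    data_type: DATE
    tags: [date_column]
  _platform:
    data_type: STRING
    tags: [registration_property]
  country:            # hypothetical column; tags are optional
    data_type: STRING
```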

2. Entity Properties

Entity properties define how metrics are calculated and transformed.

All properties share a common structure, including boolean flags that control how they can be used:

yaml
property_name:
  label: [optional] # default is to capitalize name -> Property Name
  data_type: TYPE
  can_filter: boolean    # Controls if property can be used in filters
  can_group_by: boolean  # Controls if property can be used in GROUP BY clauses
  event_property/computed_property/sliding_window_property/lifetime_property: [formula]

There are several types of properties:

a. Event Properties

Direct measurements from events with optional aggregation.

yaml
event_property:
  source_event: [event_name] # from the events yaml
  select: [{event_parameter}/expression]
  aggregate_function: [sum/count/max/etc]
  default_value: [optional_default]
  where: [optional_condition]

Example:

yaml
rewarded_ads_watched:
  data_type: NUMBER
  can_filter: true
  can_group_by: true
  event_property:
    source_event: ad_watched
    select: 'COUNT(*)'
    where: "{status} = 'rewarded'"
    aggregate_function: none
    default_value: 0
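
For comparison, the sketch below names the aggregation in aggregate_function instead of writing it into select. The purchase event, its price parameter, and the property name iap_revenue_on_day are hypothetical and only illustrate the structure.

```yaml
# Hypothetical: assumes a `purchase` event with a numeric `price` parameter
iap_revenue_on_day:
  label: IAP Revenue
  data_type: NUMBER
  can_filter: true
  can_group_by: false
  event_property:
    source_event: purchase
    select: '{price}'          # per-event expression
    aggregate_function: sum    # aggregation applied to the selected values
    default_value: 0           # value used when there are no matching events
```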

b. Computed Properties

Derived calculations based on other properties.

yaml
computed_property:
  select: [expression]
  # optional mapping
  value_mappings:
    - range: {from: value1, to: value2}
      new_value: [mapped_value]
    # or
    - constant: [value]
      new_value: [mapped_value]

Example:

yaml
iap_payment_segment:
  data_type: STRING
  can_filter: true
  can_group_by: true
  computed_property:
    select: '{iap_revenue_lifetime}'
    value_mappings:
      - range: {to: 0}
        new_value: Non Payer
      - range: {from: 0, to: 20}
        new_value: Minnow
      - range: {from: 20}
        new_value: Whale
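
The example above uses only range mappings. A constant mapping might look like the following sketch, which assumes a hypothetical STRING property named platform whose raw values are ios and android.

```yaml
# Hypothetical: `platform` and its raw values are assumptions
platform_label:
  data_type: STRING
  can_filter: true
  can_group_by: true
  computed_property:
    select: '{platform}'
    value_mappings:
      - constant: ios
        new_value: iOS
      - constant: android
        new_value: Android
```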

c. Sliding Window Properties

Rolling calculations over a specified time window.

yaml
sliding_window_property:
  source_property: [base_property]
  window_function: [aggregation]
  relative_days_from: [start] # 0 is the current day
  relative_days_to: [end]     # 0 is the current day

Example:

yaml
days_active_last_7_days:
  data_type: NUMBER
  can_filter: true
  can_group_by: true
  sliding_window_property:
    source_property: '{dau_active}'
    window_function: sum
    relative_days_from: -6
    relative_days_to: 0
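
Since the 7-day window runs from -6 to 0, both bounds appear to be inclusive. Under that assumption, a 30-day variant could be sketched as follows. The property name is hypothetical, and the reference is quoted so that YAML reads it as a string and not as a mapping.

```yaml
# Hypothetical 30-day variant; reuses dau_active from the example above
days_active_last_30_days:
  data_type: NUMBER
  can_filter: true
  can_group_by: true
  sliding_window_property:
    source_property: '{dau_active}'
    window_function: sum
    relative_days_from: -29   # 29 days back ...
    relative_days_to: 0       # ... through the current day, 30 days in total
```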

d. Lifetime Properties

Cumulative calculations over the entire history.

yaml
lifetime_property:
  source_property: [base_property]
  merge_function: [aggregation]
  # or
  source_event_property:
    source_event: [event_name]
    select: [expression]
    aggregate_function: [function]

Example:

yaml
payers_lifetime:
  data_type: INTEGER
  can_filter: true
  can_group_by: true
  lifetime_property:
    source_property: '{payers_on_day}'
    merge_function: max
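
The second form from the structure above, source_event_property, might be used as in the sketch below. The purchase event and its price parameter are hypothetical, and this is only one plausible definition of iap_revenue_lifetime. The structure does not say whether merge_function is also required in this form, so it is omitted here as an assumption.

```yaml
# Hypothetical: assumes a `purchase` event with a numeric `price` parameter
iap_revenue_lifetime:
  data_type: NUMBER
  can_filter: true
  can_group_by: false
  lifetime_property:
    source_event_property:
      source_event: purchase
      select: '{price}'
      aggregate_function: sum
```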

3. KPI Definitions

KPIs define high-level metrics for analysis and reporting.

Structure:

yaml
kpi_name:
  label: [optional_display_name]
  select: [calculation_expression]
  where: [optional_filter]
  unit:
    symbol: [unit_symbol]
    is_prefix: [boolean]
  x_axis:
    dimension_name:
      total_function: [optional_aggregation]

Key features:

  • Support for templated KPIs using a {} placeholder in the KPI name and expressions
  • Flexible dimension configuration via x_axis
  • Optional filtering conditions
  • Unit specification

Example:

yaml
retention_d{}:
  select: SAFE_DIVIDE({kpi.dau} * 100, SUM({property.cohort_size}))
  where: '{property.cohort_day} = {}'
  unit: {symbol: '%', is_prefix: false}
  x_axis:
    date: {}
  template: cohort_day
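
A non-templated KPI with a label, a prefix unit, and a dimension total might look like the following sketch. The referenced property iap_revenue_on_day and the total_function value sum are assumptions chosen for illustration.

```yaml
# Hypothetical KPI; the property name and total_function value are assumptions
revenue:
  label: Revenue
  select: SUM({property.iap_revenue_on_day})
  unit: {symbol: '$', is_prefix: true}   # displayed as $123, not 123$
  x_axis:
    date:
      total_function: sum
```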

Property Configuration Options

Common Attributes

All properties can have:

  • data_type: The property's data type
  • can_filter: Whether the property can be used in filters
  • can_group_by: Whether the property can be used for grouping

Special Features

  1. Value Mappings: Transform numeric or categorical values
  2. Template Support: Generate multiple similar metrics
  3. Conditional Aggregation: Apply aggregations with filters
  4. Window Functions: Rolling calculations
  5. Custom Expressions: Complex calculations using SQL-like syntax

Best Practices

  1. Organization

    • Group related properties and KPIs in separate files
    • Use consistent naming conventions
    • Document complex calculations
  2. Property Design

    • Choose appropriate property types for calculations
    • Set reasonable default values
    • Consider performance implications of window functions
  3. KPI Design

    • Provide clear labels and units
    • Use templating for related metrics
    • Define appropriate aggregations for dimensions
  4. Schema Validation

    • Always reference appropriate schemas
    • Validate configurations before deployment
    • Keep schemas up to date

Schema References

JSON schemas are provided so that these files can be linted automatically. Available schemas:

  • event_logical_table.json: Event table definitions
  • entity_properties.json: Property definitions
  • entity_kpis.json: KPI definitions

To enable YAML linting, each configuration file should reference the appropriate schema:

yaml
# $schema: http://schema.asemicanalytics.com/v1/semantic_layer/[schema_file]
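
For instance, an event table file would substitute event_logical_table.json for the placeholder. The sketch below reuses the demodata.ad_watched example and assumes the schema comment belongs on the first line of the file.

```yaml
# $schema: http://schema.asemicanalytics.com/v1/semantic_layer/event_logical_table.json
table_name: demodata.ad_watched
tags: [registration_event, activity_event]
columns:
  user_id:
    data_type: STRING
    tags: [entity_id_column]
  _time:
    data_type: DATETIME
    tags: [event_timestamp_column]
```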