7. Create Observation Tables

What is an Observation Table?

An Observation Table is a structured collection of historical data points that acts as the foundation for training datasets. By adding features, you can create Feature Tables that can be used to train and validate Machine Learning models.

Each data point represents a specific historical moment for a particular entity and may also include target values. Observation Tables are often reused across experiments within the same use case, even when the selected features and models vary.

How to create an Observation Table?

You can either upload an Observation Table from a Parquet or CSV file, or create it from a Source Table.
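
For the upload route, the following is a minimal sketch of what such a file could contain, using only the Python standard library. The column names mirror those used later in this guide; the rows are made up, and the timestamp format and the `to_csv_text` helper are assumptions for illustration, not a statement of what the upload requires.

```python
import csv
import io
from datetime import datetime

# Made-up rows; column names mirror those used in this guide.
rows = [
    {"POINT_IN_TIME": datetime(2024, 3, 15, 9, 30), "SK_ID_CURR": 100001, "Loan_Default": 0},
    {"POINT_IN_TIME": datetime(2024, 6, 2, 14, 0), "SK_ID_CURR": 100002, "Loan_Default": 1},
]


def to_csv_text(rows):
    """Serialize observation rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["POINT_IN_TIME", "SK_ID_CURR", "Loan_Default"],
        lineterminator="\n",
    )
    writer.writeheader()
    for row in rows:
        # Assumption: "YYYY-MM-DD HH:MM:SS" timestamps are acceptable.
        formatted = row["POINT_IN_TIME"].strftime("%Y-%m-%d %H:%M:%S")
        writer.writerow({**row, "POINT_IN_TIME": formatted})
    return buffer.getvalue()


csv_text = to_csv_text(rows)
print(csv_text)
```

Writing `csv_text` to a file with a `.csv` extension would give a small file to experiment with.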

This guide explains how to configure Observation Tables from a Source Table and link them to our Credit Default context and use case.

We will create four Observation Tables:

  1. Applications up to Sept 2024 with Loan Defaults: Credit Default Observations for TRAINING up to Sept 2024.
  2. Applications Oct 2024 with Loan Defaults: Credit Default Observations for HOLDOUT (Oct 2024).
  3. 50K applications: Credit Default Observations for EDA.
  4. Applications Preview: 50 Credit Default Observations for Feature PREVIEW.

Check out

For an example of how to upload an Observation Table, check out the Grocery UI Tutorial. The tutorial also covers how to add Target values when your target has been registered with a logical approach.


Step 1: Navigate to Observation Table Catalog

From the menu, navigate to the 'Formulate' section and select the Observation Table catalog.

Empty Observation Table Catalog


Step 2: Create Observation Tables from a Source Table

  1. Click Image.
  2. Select the 'Derive from Source Table' tab and click Image.

  3. In the Source Table catalog, we will use the OBSERVATIONS_WITH_TARGET and OBSERVATION_EDA_TABLE tables under the DEMO_DATASETS database and the CREDIT_DEFAULT schema. Select the table you need and click Image.


Step 2a: Create 'Applications up to Sept 2024 with Loan Defaults' table:

  1. In the Source Table catalog, select OBSERVATIONS_WITH_TARGET and click Image.

  2. Set the table as follows:

    • Name: "Applications up to Sept 2024 with Loan Defaults"
    • Description: "Credit Default Observations for training up to Sept 2024."
    • Purpose: Training
    • Sample Rows: 0
    • Primary Entity: New Application
    • Sampling Date Range: January 1, 2018 - October 1, 2024
    • Columns to Include:

      1. Original Column Name: POINT_IN_TIME --> New Column Name: POINT_IN_TIME
      2. Original Column Name: SK_ID_CURR --> New Column Name: SK_ID_CURR
      3. Original Column Name: Loan_Default --> New Column Name: Loan_Default (as Target)

    Disable sampling

    To disable sampling, set the Sample Rows to 0.

  3. Ensure Loan_Default is identified as the target.

  4. Click Image to save the table.


Step 2b: Create 'Applications Oct 2024 with Loan Defaults' table:

  1. In the Source Table catalog, select OBSERVATIONS_WITH_TARGET and click Image.

  2. Set the holdout observation table as follows:

    • Name: "Applications Oct 2024 with Loan Defaults"
    • Description: "Credit Default Observations for holdout (Oct 2024)"
    • Purpose: Validation-Test
    • Sample Rows: 0
    • Primary Entity: New Application
    • Sampling Date Range: October 1, 2024 - November 1, 2024
    • Columns to Include:

      1. Original Column Name: POINT_IN_TIME --> New Column Name: POINT_IN_TIME
      2. Original Column Name: SK_ID_CURR --> New Column Name: SK_ID_CURR
      3. Original Column Name: Loan_Default --> New Column Name: Loan_Default (as Target)
  3. Click Image to save the table.
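
The sampling date ranges in Steps 2a and 2b share the boundary date October 1, 2024. The sketch below shows one way to reason about such a split, assuming each range includes its start and excludes its end. This guide does not state how the UI treats the boundaries, so treat that as an assumption. The helpers `in_range` and `assign_split` and the sample timestamps are hypothetical.

```python
from datetime import datetime

# Date ranges from Steps 2a and 2b.
TRAINING = (datetime(2018, 1, 1), datetime(2024, 10, 1))
HOLDOUT = (datetime(2024, 10, 1), datetime(2024, 11, 1))


def in_range(point_in_time, date_range):
    """Return True if start <= point_in_time < end (assumed half-open range)."""
    start, end = date_range
    return start <= point_in_time < end


def assign_split(point_in_time):
    """Label a timestamp 'training', 'holdout', or None if outside both ranges."""
    if in_range(point_in_time, TRAINING):
        return "training"
    if in_range(point_in_time, HOLDOUT):
        return "holdout"
    return None


# Made-up timestamps around the boundaries.
for ts in [
    datetime(2024, 9, 30, 23, 59, 59),
    datetime(2024, 10, 1, 0, 0, 0),
    datetime(2024, 11, 1, 0, 0, 0),
]:
    print(ts.isoformat(), "->", assign_split(ts))
```

Under this assumption no observation can land in both tables, which is the property a training/holdout split needs.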


Step 2c: Create '50K applications' table:

  1. In the Source Table catalog, select OBSERVATION_EDA_TABLE and click Image.

  2. Set the EDA observation table as follows:

    • Name: "50K applications"
    • Description: "Credit Default Observations for EDA."
    • Purpose: EDA
    • Sample Rows: 0
    • Primary Entity: New Application
    • Sampling Date Range: None
    • Columns to Include:

      1. Original Column Name: POINT_IN_TIME --> New Column Name: POINT_IN_TIME
      2. Original Column Name: SK_ID_CURR --> New Column Name: SK_ID_CURR
      3. Original Column Name: Loan_Default --> New Column Name: Loan_Default (as Target)
  3. Click Image to save the table.


Step 2d: Create 'Applications Preview' table:

  1. In the Source Table catalog, select OBSERVATION_EDA_TABLE and click Image.

  2. Set the preview observation table as follows:

    • Name: "Applications Preview"
    • Description: "50 Credit Default Observations for preview."
    • Purpose: Preview
    • Sample Rows: 50
    • Primary Entity: New Application
    • Sampling Date Range: None
    • Columns to Include:

      1. Original Column Name: POINT_IN_TIME --> New Column Name: POINT_IN_TIME
      2. Original Column Name: SK_ID_CURR --> New Column Name: SK_ID_CURR
      3. Original Column Name: Loan_Default --> New Column Name: Loan_Default (as Target)
  3. Click Image to save the table.
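
The Sample Rows setting used in Steps 2a to 2d can be pictured with the following sketch: 0 keeps every row, and a positive value keeps at most that many rows. How the platform actually selects rows is not described in this guide, so the uniform random sampling, the `sample_rows` helper, and its `seed` parameter are assumptions made for illustration.

```python
import random


def sample_rows(rows, sample_size, seed=0):
    """Return all rows when sample_size is 0, else a random subset of at most sample_size rows."""
    if sample_size < 0:
        raise ValueError("sample_size must be non-negative")
    if sample_size == 0 or sample_size >= len(rows):
        return list(rows)
    # A seeded generator keeps the illustration reproducible.
    return random.Random(seed).sample(rows, sample_size)


all_rows = list(range(1000))         # stand-in for observation rows
preview = sample_rows(all_rows, 50)  # like "Applications Preview" (Sample Rows: 50)
full = sample_rows(all_rows, 0)      # like the tables with Sample Rows: 0

print(len(preview), len(full))
```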


Step 3: Link Observation Tables to the Context

  1. Navigate to the Context Catalog and select the "New Loan Application" context.

  2. In the 'About' tab, click Image under the Observation tables section and select one of the four Observation Tables. Confirm selection by clicking Image.

  3. Repeat the previous step for each of the four Observation Tables.


Step 4: Link Observation Tables to the Use Case

  1. Navigate to the Use Case Catalog and select the "Loan Default by client" use case.

  2. In the 'About' tab, click Image under the Observation tables section and select one of the four Observation Tables. Confirm selection by clicking Image.

  3. Repeat the previous step for each of the four Observation Tables.

  4. Set "50K applications" as the EDA Table.

  5. Set "Applications Preview" as the Preview Table.


Step 5: Check Observation Tables

Verify that the four Observation Tables were registered successfully by reviewing the Use Case Catalog.

Then confirm that they also appear in the Observation Table Catalog.