7. Create Observation Tables
An Observation Set pairs historical points in time (timestamps) with entity key values. Feature values are then computed as of those points in time. Think of it as the backbone of a training dataset.
An Observation Table is the representation of an Observation Set in the feature store.
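To make the idea concrete, here is a minimal sketch of an observation set in plain Python. The column names `POINT_IN_TIME` and `APPLICATION_ID`, the helper `is_valid_observation_set`, and the sample values are assumptions chosen for illustration; they are not taken from the tutorial's data.

```python
from datetime import datetime

# Hypothetical observation set: each row pairs a point in time with an entity key.
# Column names and values are illustrative assumptions, not the tutorial's data.
observation_set = [
    {"POINT_IN_TIME": datetime(2024, 1, 15, 9, 30), "APPLICATION_ID": 1001},
    {"POINT_IN_TIME": datetime(2024, 6, 2, 14, 0), "APPLICATION_ID": 1002},
    {"POINT_IN_TIME": datetime(2025, 3, 20, 11, 45), "APPLICATION_ID": 1003},
]


def is_valid_observation_set(rows, time_col, key_col):
    """Check that every row has a timestamp and a non-null entity key."""
    return all(
        isinstance(row.get(time_col), datetime) and row.get(key_col) is not None
        for row in rows
    )


valid = is_valid_observation_set(observation_set, "POINT_IN_TIME", "APPLICATION_ID")
```

Each row answers the question "what did we know about this entity at this moment?", which is why both columns must be present on every row.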
You can either:
- upload an Observation Table from a Parquet or CSV file,
- create it from a View, or
- create it from a Source Table.
This guide explains how to configure Observation Tables from a Source Table and link them to our Credit Default context and use case.
We will create one table, "Applications up to March 2025", which holds Credit Default observations for training up to March 2025.
For an example of how to upload an Observation Table or create one from a View, see the Grocery SDK Tutorial. That tutorial also covers how to add Target values when your target has been registered with a logical approach.
import featurebyte as fb
import pandas as pd
# Set your profile to the tutorial environment
fb.use_profile("tutorial")
catalog_name = "Credit Default Dataset SDK Tutorial"
catalog = fb.Catalog.activate(catalog_name)
16:59:37 | INFO | SDK version: 3.2.0.dev66
16:59:37 | INFO | No catalog activated.
16:59:37 | INFO | Using profile: staging
16:59:37 | INFO | Using configuration file at: /Users/gxav/.featurebyte/config.yaml
16:59:37 | INFO | Active profile: staging (https://staging.featurebyte.com/api/v1)
16:59:37 | INFO | SDK version: 3.2.0.dev66
16:59:37 | INFO | No catalog activated.
16:59:37 | INFO | Catalog activated: Credit Default Dataset SDK Tutorial
Locate Source Table
ds = catalog.get_data_source()
DATABASE_NAME = "DEMO_DATASETS"
SCHEMA_NAME = "CREDIT_DEFAULT"
training_observations = ds.get_source_table(
    database_name=DATABASE_NAME,
    schema_name=SCHEMA_NAME,
    table_name="OBSERVATIONS_WITH_TARGET",
)
Locate the Target, Context and Use Case that the observation tables will be linked to
context_name = "New Loan Application"
target_name = "Loan_Default"
use_case_name = "Loan Default by client"
usecase = catalog.get_use_case(use_case_name)
Create the "Applications up to March 2025" table
observation_train_table_name = "Applications up to March 2025"
observation_train_table = training_observations.create_observation_table(
    name=observation_train_table_name,
    sample_rows=None,
    sample_from_timestamp="2019-04-01",
    sample_to_timestamp="2025-04-01",
    context_name=context_name,
    primary_entities=["New Application"],
    target_column=target_name,
)
observation_train_table.update_purpose(fb.Purpose.TRAINING)
# link it to the use case
usecase.add_observation_table(observation_train_table_name)
Done! |████████████████████████████████████████| 100% in 18.3s (0.06%/s)
Done! |████████████████████████████████████████| 100% in 15.2s (0.07%/s)
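The `sample_from_timestamp` and `sample_to_timestamp` arguments restrict the observations to a time window. The sketch below mimics that kind of filter in plain Python, so you can sanity-check which rows you expect to keep. It assumes a half-open window (start included, end excluded). That reading fits the table name "Applications up to March 2025" but is not confirmed by this guide, so check the SDK reference for the exact boundary behavior. The helper `in_window` and the sample rows are invented for illustration.

```python
from datetime import datetime


def in_window(point_in_time, start, end):
    """Return True when start <= point_in_time < end (half-open window, an assumption)."""
    return start <= point_in_time < end


# Window bounds taken from the create_observation_table call in this guide.
start = datetime.fromisoformat("2019-04-01")
end = datetime.fromisoformat("2025-04-01")

# Invented points in time that probe both edges of the window.
points_in_time = [
    datetime(2019, 3, 31, 23, 59),  # before the window
    datetime(2019, 4, 1, 0, 0),     # first instant of the window
    datetime(2025, 3, 31, 23, 59),  # last minute of March 2025
    datetime(2025, 4, 1, 0, 0),     # excluded under the half-open assumption
]

kept = [t for t in points_in_time if in_window(t, start, end)]
```

Under this assumption, `kept` holds only the two middle rows, so the resulting table would end with March 2025 applications.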
List observation tables in catalog
catalog.list_observation_tables()
| | id | name | type | shape | feature_store_name | created_at |
|---|---|---|---|---|---|---|
| 0 | 68ef627ae645921d591abdd9 | Applications up to March 2025 | source_table | [307511, 3] | playground | 2025-10-15T08:59:47.822000 |