13. Compute Historical Feature Values
Historical feature values are needed to train and test machine learning models.
Let's take the feature list we just created and compute its feature values for a given observation table.
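To make "historical" concrete: each observation row carries a point in time, and a feature value for that row should be derived only from data available before that moment. The sketch below illustrates the idea in plain Python with made-up events. It is a toy model under stated assumptions, not FeatureByte's implementation, and the data, the `total_amount_as_of` helper and the choice of a strict "before" comparison are all hypothetical.

```python
from datetime import datetime

# Hypothetical toy events: (client_id, event_timestamp, amount).
# These rows are invented for illustration and are not from the tutorial dataset.
EVENTS = [
    ("c1", datetime(2025, 1, 5), 100.0),
    ("c1", datetime(2025, 2, 10), 50.0),
    ("c1", datetime(2025, 3, 20), 70.0),
    ("c2", datetime(2025, 1, 15), 30.0),
]


def total_amount_as_of(events, client_id, point_in_time):
    """Sum amounts for one client, using only events strictly before point_in_time."""
    return sum(
        (amount for cid, ts, amount in events if cid == client_id and ts < point_in_time),
        0.0,
    )


# Each observation pairs an entity key with the point in time at which
# the feature value is requested.
observations = [
    ("c1", datetime(2025, 3, 1)),
    ("c2", datetime(2025, 1, 10)),
]

feature_values = [total_amount_as_of(EVENTS, cid, pit) for cid, pit in observations]
print(feature_values)
```

The March 20 event for `c1` is excluded because it happens after that observation's point in time; including it would leak future information into the training data.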
In [1]:
import featurebyte as fb
# Set your profile to the tutorial environment
fb.use_profile("tutorial")
catalog_name = "Credit Default Dataset SDK Tutorial"
catalog = fb.Catalog.activate(catalog_name)
18:04:49 | INFO | SDK version: 3.3.1
18:04:49 | INFO | No catalog activated.
18:04:49 | INFO | Using profile: tutorial
18:04:49 | INFO | Using configuration file at: /Users/gxav/.featurebyte/config.yaml
18:04:49 | INFO | Active profile: tutorial (https://tutorials.featurebyte.com/api/v1)
18:04:49 | INFO | SDK version: 3.3.1
18:04:49 | INFO | No catalog activated.
18:04:50 | INFO | Catalog activated: Credit Default Dataset SDK Tutorial
List feature lists in Catalog
In [2]:
catalog.list_feature_lists()
Out[2]:
| | id | name | num_feature | status | deployed | readiness_frac | online_frac | tables | entities | primary_entity | created_at |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 69315caf133daa6ca04988cd | 40 features for Credit Default | 40 | DRAFT | False | 0.0 | 0.0 | [NEW_APPLICATION, CLIENT_PROFILE, BUREAU, INST... | [New Application, Client] | [New Application] | 2025-12-04T10:04:36.571000 |
Get Feature List from Catalog
In [3]:
feature_list_name = "40 features for Credit Default"
simple_feature_list = catalog.get_feature_list(feature_list_name)
Loading Feature(s) |████████████████████████████████████████| 40/40 [100%] in 0.
Get an observation table
In [4]:
# List observation tables
catalog.list_observation_tables()
Out[4]:
| | id | name | type | shape | feature_store_name | created_at |
|---|---|---|---|---|---|---|
| 0 | 69315bf8565d52c79a28f8e1 | Applications up to March 2025 | source_table | [307511, 3] | playground | 2025-12-04T10:01:41.876000 |
In [5]:
# Get observation table: 'Applications up to March 2025'
training_observations = catalog.get_observation_table("Applications up to March 2025")
Compute historical features
In [6]:
# Create training data
training_data_table = simple_feature_list.compute_historical_feature_table(
    training_observations,
    historical_feature_table_name=f"{feature_list_name} - TRAIN",
)
Done! |████████████████████████████████████████| 100% in 2:12.6 (0.01%/s)
Done! |████████████████████████████████████████| 100% in 36.4s (0.03%/s)
In [7]:
# List historical feature tables from catalog
catalog.list_historical_feature_tables()
Out[7]:
| | id | name | feature_store_name | observation_table_name | shape | created_at |
|---|---|---|---|---|---|---|
| 0 | 69315cc4da8963295e76f9d5 | 40 features for Credit Default - TRAIN | playground | Applications up to March 2025 | [307511, 43] | 2025-12-04T10:07:01.157000 |
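The shapes reported in this tutorial allow a quick sanity check: the observation table has 307511 rows and 3 columns, the feature list has 40 features, and the historical feature table has 307511 rows and 43 columns. That is consistent with the table keeping every observation row and column and adding one column per feature. The small check below encodes that arithmetic; `expected_feature_table_shape` is a hypothetical helper, and the one-column-per-feature rule is an assumption inferred from these numbers rather than a documented guarantee.

```python
def expected_feature_table_shape(observation_shape, num_features):
    """Return [rows, columns] expected if each feature adds exactly one column.

    Assumption: the historical feature table keeps all observation rows and
    columns and appends one column per feature.
    """
    rows, observation_columns = observation_shape
    return [rows, observation_columns + num_features]


# Values copied from the catalog listings in this tutorial.
observation_shape = [307511, 3]
num_features = 40
reported_feature_table_shape = [307511, 43]

expected = expected_feature_table_shape(observation_shape, num_features)
shapes_match = expected == reported_feature_table_shape
print(expected, shapes_match)
```

A mismatch in the row count would suggest observations were dropped or duplicated during computation, which is worth investigating before training a model.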
Concepts in this tutorial
SDK reference
- Historical feature table
- FeatureList.compute_historical_feature_table()
- FeatureList.compute_historical_features(), to compute a DataFrame directly
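When computing a DataFrame directly, the observation set is supplied in memory, and it needs a `POINT_IN_TIME` column plus a column for the entity's serving name. The sketch below checks a handful of rows for those columns before any request is made. `validate_observation_rows` is a hypothetical helper, and the serving name `SK_ID_CURR` is an assumption about this dataset; check the entity definition in your catalog for the actual name.

```python
from datetime import datetime

POINT_IN_TIME_COLUMN = "POINT_IN_TIME"
SERVING_NAME_COLUMN = "SK_ID_CURR"  # assumed serving name; verify in your catalog


def validate_observation_rows(rows, required_columns):
    """Return a list of (row_index, missing_columns) for incomplete rows."""
    problems = []
    for index, row in enumerate(rows):
        missing = sorted(set(required_columns) - set(row))
        if missing:
            problems.append((index, missing))
    return problems


# Hypothetical observation rows; the second one lacks a point in time.
rows = [
    {POINT_IN_TIME_COLUMN: datetime(2025, 3, 1), SERVING_NAME_COLUMN: 100002},
    {SERVING_NAME_COLUMN: 100003},
]

problems = validate_observation_rows(rows, [POINT_IN_TIME_COLUMN, SERVING_NAME_COLUMN])
print(problems)
```

Once the rows pass validation, they could be loaded into a pandas DataFrame and passed to `FeatureList.compute_historical_features()`, which suits small samples better than full training sets.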