14. Compute Historical Feature Values
Training and testing machine learning models requires historical feature values: the value each feature had at the point in time of each observation.
Let's take the feature list we just created and compute its feature values for a given observation table.
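To make "historical" concrete: a feature is evaluated as of each observation's point in time, using only data from before that moment. The sketch below illustrates the idea in plain Python on made-up invoices, with a hypothetical helper `count_invoices_before`. It is a toy illustration of the concept, not FeatureByte's implementation.

```python
from datetime import datetime, timedelta

# Made-up invoices for illustration only: (customer_id, invoice_timestamp).
invoices = [
    ("c1", datetime(2023, 3, 1, 10, 0)),
    ("c1", datetime(2023, 3, 10, 9, 30)),
    ("c1", datetime(2023, 3, 20, 18, 0)),
    ("c2", datetime(2023, 3, 5, 12, 0)),
]

# Made-up observations: one row per (customer_id, point_in_time).
observations = [
    ("c1", datetime(2023, 3, 15, 0, 0)),
    ("c1", datetime(2023, 3, 25, 0, 0)),
    ("c2", datetime(2023, 3, 25, 0, 0)),
]


def count_invoices_before(customer_id, point_in_time, window=timedelta(days=14)):
    """Count a customer's invoices in the window ending at point_in_time.

    Hypothetical helper: invoices at or after point_in_time are ignored,
    so no information from the "future" leaks into the feature value.
    """
    window_start = point_in_time - window
    return sum(
        1
        for cust, ts in invoices
        if cust == customer_id and window_start <= ts < point_in_time
    )


# One feature value per observation row.
feature_values = [
    (cust, pit, count_invoices_before(cust, pit)) for cust, pit in observations
]
for cust, pit, count in feature_values:
    print(cust, pit.isoformat(), count)
```

The same customer gets a different value at each point in time, which is why the observation table pairs every entity key with a timestamp.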
In [1]:
import featurebyte as fb
# Set your profile to the tutorial environment
fb.use_profile("tutorial")
catalog_name = "Grocery Dataset Tutorial"
catalog = fb.Catalog.activate(catalog_name)
16:12:25 | WARNING | Service endpoint is inaccessible: http://featurebyte-server:8088
16:12:25 | INFO | Using profile: tutorial
16:12:25 | INFO | Using configuration file at: /Users/gxav/.featurebyte/config.yaml
16:12:25 | INFO | Active profile: tutorial (https://tutorials.featurebyte.com/api/v1)
16:12:25 | WARNING | Remote SDK version (1.1.0.dev7) is different from local (1.1.0.dev1). Update local SDK to avoid unexpected behavior.
16:12:25 | INFO | No catalog activated.
16:12:25 | INFO | Catalog activated: Grocery Dataset Tutorial
List feature lists in Catalog
In [2]:
catalog.list_feature_lists()
Out[2]:
| | id | name | num_feature | status | deployed | readiness_frac | online_frac | tables | entities | primary_entity | created_at |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 666958439767025aff191450 | Customer x ProductGroup Simple FeatureList | 9 | DRAFT | False | 0.0 | 0.0 | [GROCERYCUSTOMER, GROCERYINVOICE, INVOICEITEMS... | [customer, productgroup] | [customer, productgroup] | 2024-06-12T08:12:14.787000 |
Get Feature List from Catalog
In [3]:
simple_feature_list = catalog.get_feature_list("Customer x ProductGroup Simple FeatureList")
Loading Feature(s) |████████████████████████████████████████| 9/9 [100%] in 0.2s
Get an observation table
In [4]:
# List observation tables
catalog.list_observation_tables()
Out[4]:
| | id | name | type | shape | feature_store_name | created_at |
---|---|---|---|---|---|---|
0 | 66695726850bd33441fdc242 | Preview Table with 10 items | view | [10, 2] | playground | 2024-06-12T08:07:09.337000 |
1 | 6669570fddd5be620a410f7f | In_Store_Customer_x_ProductGroup_Spending_next... | observation_table | [1000, 4] | playground | 2024-06-12T08:06:54.934000 |
2 | 666956fdddd5be620a410f7c | In_Store_Customer_x_ProductGroup_2023_1K | uploaded_file | [1000, 3] | playground | 2024-06-12T08:06:33.543000 |
In [5]:
# Get observation table: 'In_Store_Customer_x_ProductGroup_Spending_next_2_weeks_2023_1K'
training_observations = catalog.get_observation_table(
    "In_Store_Customer_x_ProductGroup_Spending_next_2_weeks_2023_1K"
)
Compute historical features
In [6]:
# Create historical feature table
table_name = (
    "Simple Training Simple Training for In_Store_Customer_x_ProductGroup_Spending_next_2_weeks_2023_1K"
)
training_data_table = simple_feature_list.compute_historical_feature_table(
    training_observations,
    historical_feature_table_name=table_name,
)
Done! |████████████████████████████████████████| 100% in 36.4s (0.03%/s)
In [7]:
display(training_data_table.to_pandas())
Downloading table |████████████████████████████████████████| 1000/1000 [100%] in
| | GROCERYCUSTOMERGUID | POINT_IN_TIME | PRODUCTGROUP | CUSTOMER_x_PRODUCTGROUP_Sum_of_TotalCost_next_2_weeks | CUSTOMER_Age_band | CUSTOMER_Latest_invoice_Amount | CUSTOMER_Count_of_invoice_14d | CUSTOMER_Avg_of_invoice_Amount_14d | CUSTOMER_Std_of_invoice_Amount_14d | CUSTOMER_Latest_invoice_Amount_Z_Score_to_invoice_Amount_28d | CUSTOMER_vs_OVERALL_item_TotalCost_across_product_ProductGroups_26w | CUSTOMER_x_PRODUCTGROUP_Sum_of_item_TotalCost_14d | CUSTOMER_x_PRODUCTGROUP_Time_Since_Latest_Timestamp |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 699efd7f-aba2-4515-9335-2c8040a94f9f | 2023-12-11 08:51:22 | Fromages | 14.18 | 80-84 | 13.13 | 4.0 | 13.860000 | 3.720329 | 0.179880 | 0.683171 | 6.00 | 166.499167 |
1 | 125dfe7d-eac0-4eab-94d8-1cd008e1641c | 2023-05-16 09:00:11 | Laits | 1.85 | 30-34 | 5.82 | 1.0 | 5.820000 | 0.000000 | -1.000000 | 0.645410 | NaN | 2653.102500 |
2 | 326b6ccb-0891-49fe-acbf-31d06c6d9e67 | 2023-03-20 13:34:55 | Céréales | 0.00 | 35-39 | 24.79 | 1.0 | 24.790000 | 0.000000 | 1.414202 | 0.624311 | NaN | 532.296944 |
3 | e42fa5f3-7737-4c6a-9ef4-856f113e60bd | 2023-12-18 19:04:45 | Fromages | 9.00 | 25-29 | 4.76 | 4.0 | 12.860000 | 7.439772 | -0.637723 | 0.649094 | 11.36 | 241.682222 |
4 | dde029d7-ceca-4e44-aad0-38e22ba11b74 | 2023-09-08 15:00:07 | Pains | 3.49 | 40-44 | 22.71 | 6.0 | 10.605000 | 8.144472 | 0.930581 | 0.740797 | 2.50 | 50.218611 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
995 | e883912f-82c4-4ca8-bfa9-0bdeb46dd4c5 | 2023-06-28 20:16:15 | Céréales | 0.00 | 70-74 | 3.00 | 1.0 | 3.000000 | 0.000000 | NaN | 0.724302 | NaN | 1994.139722 |
996 | cc96d96e-5d02-48dd-b742-d2a0ef633c43 | 2023-03-07 10:00:46 | Laits | 2.00 | 55-59 | 30.21 | NaN | NaN | NaN | NaN | 0.655556 | NaN | 984.909444 |
997 | 1b82b9eb-cc54-4cc4-a7e3-9a7417faa8a5 | 2023-11-16 20:44:02 | Laits | 2.69 | 40-44 | 4.00 | 3.0 | 8.143333 | 2.931033 | -1.413608 | 0.541741 | NaN | 1970.083056 |
998 | c0ca0bda-e7f5-4748-9b14-0e7ba9a07a47 | 2023-04-06 14:58:43 | Laits | 2.32 | 65-69 | 17.20 | 10.0 | 20.122000 | 13.270356 | -0.079666 | 0.808877 | 4.64 | 241.992500 |
999 | a0588833-ba78-41a4-b36a-d36bcd68e27e | 2023-09-16 13:40:19 | Fromages | 0.00 | 20-24 | 5.50 | 6.0 | 6.561667 | 2.902714 | -0.493589 | 0.699924 | NaN | 1578.428611 |
1000 rows × 13 columns
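Because every row carries a POINT_IN_TIME, one common way to hold out test data is a chronological split. The sketch below is an assumption about how the table might be used downstream, not a step from this tutorial. It copies three columns of the first five rows from the output above into plain Python tuples, and both the cutoff date and the helper `split_by_time` are hypothetical. The third column looks like the target, judging by the observation table's name, but that reading is also an assumption.

```python
from datetime import datetime

# Three columns of the first five rows, copied from the output above:
# (POINT_IN_TIME, PRODUCTGROUP,
#  CUSTOMER_x_PRODUCTGROUP_Sum_of_TotalCost_next_2_weeks)
rows = [
    ("2023-12-11 08:51:22", "Fromages", 14.18),
    ("2023-05-16 09:00:11", "Laits", 1.85),
    ("2023-03-20 13:34:55", "Céréales", 0.00),
    ("2023-12-18 19:04:45", "Fromages", 9.00),
    ("2023-09-08 15:00:07", "Pains", 3.49),
]


def split_by_time(rows, cutoff):
    """Hypothetical helper: rows before cutoff go to train, the rest to test."""
    train, test = [], []
    for row in rows:
        point_in_time = datetime.strptime(row[0], "%Y-%m-%d %H:%M:%S")
        (train if point_in_time < cutoff else test).append(row)
    return train, test


# Hypothetical cutoff chosen for illustration only.
cutoff = datetime(2023, 10, 1)
train_rows, test_rows = split_by_time(rows, cutoff)
print(len(train_rows), "training rows,", len(test_rows), "test rows")
```

Splitting on time rather than at random keeps the test rows strictly later than the training rows, which mirrors how a model would be used after deployment.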
In [8]:
# List historical feature tables in the catalog
catalog.list_historical_feature_tables()
Out[8]:
| | id | name | feature_store_name | observation_table_name | shape | created_at |
---|---|---|---|---|---|---|
0 | 6669586bec7bc32effdab7a4 | Simple Training Simple Training for In_Store_C... | playground | In_Store_Customer_x_ProductGroup_Spending_next... | [1000, 13] | 2024-06-12T08:12:57.839000 |
Concepts in this tutorial
SDK reference for
- Historical feature table
- FeatureList.compute_historical_feature_table()
- FeatureList.compute_historical_features(), to compute a DataFrame directly