13. Create Feature List
A feature list is a collection of features that are used together to train a model. In this step, we compile a feature list from some of the features we've created so far.
For additional features:
- Visit the 'Learn by example' section for a variety of features tailored to different entities and signals.
- Check out the 'Bring Your Own Transformer' tutorials to learn about integrating Large Language Models (LLMs) within the FeatureByte ecosystem.
For those in an enterprise setting, explore 'Ideate Features with FeatureByte Copilot' which adopts an agentic approach to ideate features tailored to your use case.
In [1]:
import featurebyte as fb
# Set your profile to the tutorial environment
fb.use_profile("tutorial")
catalog_name = "Grocery Dataset Tutorial"
catalog = fb.Catalog.activate(catalog_name)
14:30:17 | INFO | SDK version: 3.0.1.dev45
14:30:17 | INFO | No catalog activated.
14:30:17 | INFO | Using profile: tutorial
14:30:17 | INFO | Using configuration file at: /Users/gxav/.featurebyte/config.yaml
14:30:17 | INFO | Active profile: tutorial (https://tutorials.featurebyte.com/api/v1)
14:30:17 | INFO | SDK version: 3.0.1.dev45
14:30:17 | INFO | No catalog activated.
14:30:17 | INFO | Catalog activated: Grocery Dataset Tutorial
List all features we created so far
In [2]:
catalog.list_features()
Out[2]:
| | id | name | dtype | readiness | online_enabled | tables | primary_tables | entities | primary_entities | created_at |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 683d4463b4cfc2806a63dcb2 | CUSTOMER_Mean_vector_of_item_product_ProductGr... | FLOAT | DRAFT | False | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [GROCERYINVOICE, INVOICEITEMS] | [customer] | [customer] | 2025-06-02T06:29:21.284000 |
| 1 | 683d44313a504320cf0dbb86 | CUSTOMER_vs_OVERALL_item_TotalCost_across_prod... | FLOAT | DRAFT | False | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [customer] | [customer] | 2025-06-02T06:27:10.713000 |
| 2 | 683d43364b16d208b726da64 | CUSTOMER_Latest_invoice_Amount_Z_Score_to_invo... | FLOAT | DRAFT | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2025-06-02T06:22:51.361000 |
| 3 | 683d42ebc133ec5ab4a622c8 | CUSTOMER_Std_of_invoice_Amount_28d | FLOAT | DRAFT | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2025-06-02T06:21:49.200000 |
| 4 | 683d42ebc133ec5ab4a622c7 | CUSTOMER_Std_of_invoice_Amount_14d | FLOAT | DRAFT | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2025-06-02T06:21:48.870000 |
| 5 | 683d42ebc133ec5ab4a622c6 | CUSTOMER_Avg_of_invoice_Amount_28d | FLOAT | DRAFT | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2025-06-02T06:21:48.540000 |
| 6 | 683d42ebc133ec5ab4a622c5 | CUSTOMER_Avg_of_invoice_Amount_14d | FLOAT | DRAFT | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2025-06-02T06:21:48.194000 |
| 7 | 683d42ebc133ec5ab4a622c4 | CUSTOMER_Count_of_invoice_28d | INT | DRAFT | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2025-06-02T06:21:47.877000 |
| 8 | 683d42ebc133ec5ab4a622c3 | CUSTOMER_Count_of_invoice_14d | INT | DRAFT | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2025-06-02T06:21:47.619000 |
| 9 | 683d42ebc133ec5ab4a622c2 | CUSTOMER_Latest_invoice_Amount | FLOAT | DRAFT | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2025-06-02T06:21:47.232000 |
| 10 | 683d42ebc133ec5ab4a622be | CUSTOMER_x_PRODUCTGROUP_Sum_of_item_TotalCost_28d | FLOAT | DRAFT | False | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [customer, productgroup] | [customer, productgroup] | 2025-06-02T06:21:46.932000 |
| 11 | 683d42ebc133ec5ab4a622bd | CUSTOMER_x_PRODUCTGROUP_Sum_of_item_TotalCost_14d | FLOAT | DRAFT | False | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [customer, productgroup] | [customer, productgroup] | 2025-06-02T06:21:46.567000 |
| 12 | 683d42ebc133ec5ab4a622c1 | CUSTOMER_x_PRODUCTGROUP_Time_Since_Latest_Time... | FLOAT | DRAFT | False | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [customer, productgroup] | [customer, productgroup] | 2025-06-02T06:21:46.025000 |
| 13 | 683d426c59a4af633ab188d6 | CUSTOMER_Age_band | VARCHAR | DRAFT | False | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [customer] | [customer] | 2025-06-02T06:20:46.476000 |
| 14 | 683d426c59a4af633ab188cc | CUSTOMER_Age | INT | DRAFT | False | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [customer] | [customer] | 2025-06-02T06:20:39.849000 |
Get features from catalog
In [3]:
customer_age_band = catalog.get_feature("CUSTOMER_Age_band")
customer_latest_invoice_amount = catalog.get_feature("CUSTOMER_Latest_invoice_Amount")
customer_count_of_invoice_14d = catalog.get_feature("CUSTOMER_Count_of_invoice_14d")
customer_avg_of_invoice_amount_14d = catalog.get_feature("CUSTOMER_Avg_of_invoice_Amount_14d")
customer_std_of_invoice_amount_14d = catalog.get_feature("CUSTOMER_Std_of_invoice_Amount_14d")
customer_latest_invoice_amount_Z_score_to_invoice_amount_28d = catalog.get_feature(
    "CUSTOMER_Latest_invoice_Amount_Z_Score_to_invoice_Amount_28d"
)
customer_vs_overall_item_totalcost_across_product_productgroups_26w = catalog.get_feature(
    "CUSTOMER_vs_OVERALL_item_TotalCost_across_product_ProductGroups_26w"
)
customer_x_productgroup_sum_of_item_totalcost_14d = catalog.get_feature(
    "CUSTOMER_x_PRODUCTGROUP_Sum_of_item_TotalCost_14d"
)
customer_x_productgroup_time_since_latest_timestamp = catalog.get_feature(
    "CUSTOMER_x_PRODUCTGROUP_Time_Since_Latest_Timestamp"
)
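Fetching nine features into nine separate variables works, but the same step can be driven by a list of names. The sketch below is one possible way to do that; `load_features` and `fake_get_feature` are hypothetical helpers introduced only for illustration, and the stand-in lookup exists so the snippet runs without a FeatureByte connection. With an activated catalog, `catalog.get_feature` would be passed in its place.

```python
from typing import Any, Callable, Dict, Iterable

# Names of the features retrieved in this step, copied from the cell above.
FEATURE_NAMES = [
    "CUSTOMER_Age_band",
    "CUSTOMER_Latest_invoice_Amount",
    "CUSTOMER_Count_of_invoice_14d",
    "CUSTOMER_Avg_of_invoice_Amount_14d",
    "CUSTOMER_Std_of_invoice_Amount_14d",
    "CUSTOMER_Latest_invoice_Amount_Z_Score_to_invoice_Amount_28d",
    "CUSTOMER_vs_OVERALL_item_TotalCost_across_product_ProductGroups_26w",
    "CUSTOMER_x_PRODUCTGROUP_Sum_of_item_TotalCost_14d",
    "CUSTOMER_x_PRODUCTGROUP_Time_Since_Latest_Timestamp",
]


def load_features(get_feature: Callable[[str], Any], names: Iterable[str]) -> Dict[str, Any]:
    """Hypothetical helper: look up each name once and key the result by name."""
    return {name: get_feature(name) for name in names}


def fake_get_feature(name: str) -> str:
    """Stand-in for catalog.get_feature so the sketch runs offline (assumption)."""
    return f"<feature {name}>"


# In the tutorial environment: load_features(catalog.get_feature, FEATURE_NAMES)
features = load_features(fake_get_feature, FEATURE_NAMES)
print(len(features))
```

Because dictionaries preserve insertion order, `list(features.values())` yields the features in the order they were named, which is the form the feature list constructor in the next step accepts.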
Create feature list
In [4]:
simple_feature_list = fb.FeatureList(
    [
        customer_age_band,
        customer_latest_invoice_amount,
        customer_count_of_invoice_14d,
        customer_avg_of_invoice_amount_14d,
        customer_std_of_invoice_amount_14d,
        customer_latest_invoice_amount_Z_score_to_invoice_amount_28d,
        customer_vs_overall_item_totalcost_across_product_productgroups_26w,
        customer_x_productgroup_sum_of_item_totalcost_14d,
        customer_x_productgroup_time_since_latest_timestamp,
    ],
    name="Customer x ProductGroup Simple FeatureList",
)
Preview feature list
In [5]:
# Check the primary entity of the feature list
simple_feature_list.primary_entity
Out[5]:
[<featurebyte.api.entity.Entity at 0x16bd0cef0>
 {
   'name': 'customer',
   'created_at': '2025-06-02T06:17:26.675000',
   'updated_at': '2025-06-02T06:17:28.925000',
   'description': None,
   'serving_names': [
     'GROCERYCUSTOMERGUID'
   ],
   'catalog_name': 'Grocery Dataset Tutorial'
 },
 <featurebyte.api.entity.Entity at 0x17b3dcf40>
 {
   'name': 'productgroup',
   'created_at': '2025-06-02T06:17:27.207000',
   'updated_at': '2025-06-02T06:17:29.690000',
   'description': None,
   'serving_names': [
     'PRODUCTGROUP'
   ],
   'catalog_name': 'Grocery Dataset Tutorial'
 }]
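The feature list reports two primary entities even though most of its features are keyed on `customer` alone. A simplified way to see why is to take the union of the primary entities of the individual features, as recorded in the feature listing earlier. The sketch below does only that; it is an illustration rather than FeatureByte's actual rule, which may also account for parent-child relationships between entities (not modeled here). The helper name `union_of_primary_entities` is hypothetical.

```python
from typing import Dict, List

# Primary entities per selected feature, copied from the feature listing.
FEATURE_PRIMARY_ENTITIES: Dict[str, List[str]] = {
    "CUSTOMER_Age_band": ["customer"],
    "CUSTOMER_Latest_invoice_Amount": ["customer"],
    "CUSTOMER_Count_of_invoice_14d": ["customer"],
    "CUSTOMER_Avg_of_invoice_Amount_14d": ["customer"],
    "CUSTOMER_Std_of_invoice_Amount_14d": ["customer"],
    "CUSTOMER_Latest_invoice_Amount_Z_Score_to_invoice_Amount_28d": ["customer"],
    "CUSTOMER_vs_OVERALL_item_TotalCost_across_product_ProductGroups_26w": ["customer"],
    "CUSTOMER_x_PRODUCTGROUP_Sum_of_item_TotalCost_14d": ["customer", "productgroup"],
    "CUSTOMER_x_PRODUCTGROUP_Time_Since_Latest_Timestamp": ["customer", "productgroup"],
}


def union_of_primary_entities(per_feature: Dict[str, List[str]]) -> List[str]:
    """Hypothetical helper: sorted union of every feature's primary entities.

    Simplification: entities related as parent and child are not collapsed.
    """
    merged = set()
    for entities in per_feature.values():
        merged.update(entities)
    return sorted(merged)


print(union_of_primary_entities(FEATURE_PRIMARY_ENTITIES))
```

Under this simplification, the two `CUSTOMER_x_PRODUCTGROUP_*` features are what bring `productgroup` into the result.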
In [6]:
# Get observation table: 'Preview Table with 10 items'
preview_table = catalog.get_observation_table("Preview Table with 10 items")
In [7]:
# Preview simple_feature_list
simple_feature_list.preview(preview_table)
Out[7]:
| | POINT_IN_TIME | GROCERYINVOICEITEMGUID | CUSTOMER_Age_band | CUSTOMER_Latest_invoice_Amount | CUSTOMER_Count_of_invoice_14d | CUSTOMER_Avg_of_invoice_Amount_14d | CUSTOMER_Std_of_invoice_Amount_14d | CUSTOMER_Latest_invoice_Amount_Z_Score_to_invoice_Amount_28d | CUSTOMER_vs_OVERALL_item_TotalCost_across_product_ProductGroups_26w | CUSTOMER_x_PRODUCTGROUP_Sum_of_item_TotalCost_14d | CUSTOMER_x_PRODUCTGROUP_Time_Since_Latest_Timestamp |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2022-07-23 15:32:29 | d87d65b8-4f78-41cc-8bd3-0064f83fe4fb | 50-54 | 2.50 | 8.0 | 14.246250 | 14.128027 | -1.067855 | 0.804626 | 5.00 | 7.715000 |
| 1 | 2023-02-28 10:28:24 | 59b63729-d448-4496-8f36-de26a91e2310 | 80-84 | 183.32 | 2.0 | 144.770000 | 38.550000 | 1.000000 | 0.728330 | 19.10 | 334.700556 |
| 2 | 2023-05-25 05:21:12 | cf670b7a-c6bf-4598-b0c0-400378b9cab6 | 20-24 | 2.99 | 6.0 | 4.273333 | 3.607879 | -0.393856 | 0.493392 | 0.99 | 269.470000 |
| 3 | 2023-04-26 19:36:57 | f867935a-d33a-43d1-b3bc-02c539769836 | 75-79 | 1.00 | 8.0 | 5.066250 | 8.702607 | -0.613917 | 0.816872 | NaN | 8904.007778 |
| 4 | 2023-01-06 14:38:32 | e63c0f14-3530-49e9-b73e-f92594e82663 | 40-44 | 22.94 | 1.0 | 22.940000 | 0.000000 | 1.000000 | 0.770818 | 2.79 | 315.693056 |
| 5 | 2023-06-09 16:38:32 | 6acb20fd-605d-4982-aa39-77054f08103c | 30-34 | 0.88 | 5.0 | 15.198000 | 19.734034 | -1.014358 | 0.627731 | NaN | 965.136389 |
| 6 | 2023-03-20 15:08:44 | 6f5299d0-fa38-4707-8108-1b66805d84e5 | 20-24 | 35.00 | 4.0 | 35.472500 | 17.705779 | -0.146625 | 0.393753 | 2.99 | 241.958889 |
| 7 | 2023-05-04 15:15:25 | 099cb405-5b2d-4dba-9071-a157ff0dbadc | 55-59 | 24.16 | 2.0 | 14.825000 | 9.335000 | 1.494143 | 0.610841 | 0.79 | 219.637222 |
| 8 | 2022-11-01 14:32:22 | 8687e2a4-7f97-4442-873c-5c52d74404f8 | 40-44 | 1.43 | 6.0 | 14.155000 | 15.361576 | -0.623232 | 0.706009 | 3.64 | 284.610278 |
| 9 | 2023-06-02 14:24:28 | 0154e4b4-25a4-4276-af72-2826bbc64c31 | 25-29 | 3.46 | NaN | NaN | NaN | NaN | 0.774300 | NaN | 745.476667 |
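As a sanity check on the preview, the values in row 1 can be reproduced by hand. The sketch below assumes two things that the tutorial does not state: that the standard deviation feature is a population standard deviation, and that this customer's 28-day window holds the same two invoices as the 14-day window, so the Z-score can be computed as (latest amount minus window mean) divided by window standard deviation. The second invoice amount is not shown in the preview; it is inferred here from the count and the average.

```python
from statistics import mean, pstdev

# Values read from row 1 of the preview table.
latest_amount = 183.32   # CUSTOMER_Latest_invoice_Amount
count_14d = 2            # CUSTOMER_Count_of_invoice_14d
avg_14d = 144.77         # CUSTOMER_Avg_of_invoice_Amount_14d

# Inferred, not shown in the preview: the other invoice implied by count and average.
other_amount = avg_14d * count_14d - latest_amount

amounts = [latest_amount, other_amount]
window_mean = mean(amounts)
window_std = pstdev(amounts)  # population standard deviation (assumption)

# Assumed Z-score definition: distance of the latest amount from the window mean,
# measured in window standard deviations.
z_score = (latest_amount - window_mean) / window_std

print(round(window_std, 2), round(z_score, 6))
```

Under these assumptions the standard deviation comes out at 38.55 and the Z-score at 1.0, matching the preview for that row. Rows where the 14-day and 28-day windows differ, such as row 7, cannot be checked this way from the 14-day columns alone.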
Save feature list
In [8]:
# Save feature list
simple_feature_list.save()
# Add description
simple_feature_list.update_description("Simple feature list for the customer x productgroup engagement")
Done! |████████████████████████████████████████| 100% in 6.1s (0.17%/s)
Done! |████████████████████████████████████████| 100% in 6.1s (0.17%/s)
Loading Feature(s) |████████████████████████████████████████| 9/9 [100%] in 0.2s