12. Create feature list
Create a feature list¶
A feature list is a collection of features that can be used with a machine learning model.
Let's take some of the features we created and bundle them into a feature list.
In [1]:
import featurebyte as fb
# Set your profile to the tutorial environment
fb.use_profile("tutorial")
catalog_name = "Grocery Dataset Tutorial"
catalog = fb.Catalog.activate(catalog_name)
22:02:03 | INFO | Using configuration file at: /Users/gxav/.featurebyte/config.yaml
22:02:03 | INFO | Active profile: tutorial (https://tutorials.featurebyte.com/api/v1)
22:02:03 | WARNING | Remote SDK version (0.5.0.dev6) is different from local (0.5.0.dev1). Update local SDK to avoid unexpected behavior.
22:02:03 | INFO | No catalog activated.
22:02:03 | INFO | 6 feature lists, 31 features deployed
22:02:03 | INFO | Using profile: tutorial
22:02:04 | INFO | Using configuration file at: /Users/gxav/.featurebyte/config.yaml
22:02:04 | INFO | Active profile: tutorial (https://tutorials.featurebyte.com/api/v1)
22:02:04 | WARNING | Remote SDK version (0.5.0.dev6) is different from local (0.5.0.dev1). Update local SDK to avoid unexpected behavior.
22:02:04 | INFO | No catalog activated.
22:02:04 | INFO | 6 feature lists, 31 features deployed
22:02:05 | INFO | Catalog activated: Grocery Dataset Tutorial
List all features we created so far¶
In [2]:
catalog.list_features()
Out[2]:
| | id | name | dtype | readiness | online_enabled | tables | primary_tables | entities | primary_entities | created_at |
---|---|---|---|---|---|---|---|---|---|---|
0 | 64ff1dc2183aab97b493af21 | CUSTOMER_vs_OVERALL_item_TotalCost_across_prod... | FLOAT | DRAFT | False | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [customer] | [customer] | 2023-09-11T14:01:53.167000 |
1 | 64ff1d9a2fa89ef7c7f4f5b8 | CUSTOMER_Latest_invoice_Amount_Z_Score_to_invo... | FLOAT | DRAFT | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:01:19.807000 |
2 | 64ff1d6a98b637caa7897494 | CUSTOMER_Std_of_invoice_Amount_28d | FLOAT | DRAFT | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:00:30.353000 |
3 | 64ff1d6a98b637caa7897493 | CUSTOMER_Std_of_invoice_Amount_14d | FLOAT | DRAFT | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:00:29.117000 |
4 | 64ff1d6a98b637caa7897492 | CUSTOMER_Avg_of_invoice_Amount_28d | FLOAT | DRAFT | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:00:27.834000 |
5 | 64ff1d6a98b637caa7897491 | CUSTOMER_Avg_of_invoice_Amount_14d | FLOAT | DRAFT | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:00:26.527000 |
6 | 64ff1d6a98b637caa789748f | CUSTOMER_Count_of_invoice_28d | FLOAT | DRAFT | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:00:25.244000 |
7 | 64ff1d6a98b637caa789748d | CUSTOMER_Count_of_invoice_14d | FLOAT | DRAFT | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:00:24.165000 |
8 | 64ff1d6a98b637caa789748c | CUSTOMER_Latest_invoice_Amount | FLOAT | DRAFT | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:00:23.056000 |
9 | 64ff1d4682feedb4dc913349 | CUSTOMER_Age_band | VARCHAR | DRAFT | False | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [customer] | [customer] | 2023-09-11T13:59:53.623000 |
10 | 64ff1d4182feedb4dc91333f | CUSTOMER_Age | INT | DRAFT | False | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [customer] | [customer] | 2023-09-11T13:59:43.782000 |
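When the catalog holds many features, it can help to narrow the listing before deciding which names to fetch. The sketch below is a plain-Python stand-in, not a FeatureByte call: a few rows from the listing above are copied by hand into dictionaries, and `names_where` is a hypothetical helper introduced only for illustration.

```python
# Rows copied by hand from the listing above (name, dtype, tables only).
# This is a stand-in for the listing, not a live catalog query.
features = [
    {"name": "CUSTOMER_Std_of_invoice_Amount_14d", "dtype": "FLOAT", "tables": ["GROCERYINVOICE"]},
    {"name": "CUSTOMER_Avg_of_invoice_Amount_14d", "dtype": "FLOAT", "tables": ["GROCERYINVOICE"]},
    {"name": "CUSTOMER_Count_of_invoice_14d", "dtype": "FLOAT", "tables": ["GROCERYINVOICE"]},
    {"name": "CUSTOMER_Latest_invoice_Amount", "dtype": "FLOAT", "tables": ["GROCERYINVOICE"]},
    {"name": "CUSTOMER_Age_band", "dtype": "VARCHAR", "tables": ["GROCERYCUSTOMER"]},
    {"name": "CUSTOMER_Age", "dtype": "INT", "tables": ["GROCERYCUSTOMER"]},
]


def names_where(rows, suffix="", dtype=None):
    """Hypothetical helper: names of rows matching a name suffix and, optionally, a dtype."""
    return [
        row["name"]
        for row in rows
        if row["name"].endswith(suffix) and (dtype is None or row["dtype"] == dtype)
    ]


# Names of the 14-day window features, in listing order.
print(names_where(features, suffix="_14d"))
# Names of the categorical (VARCHAR) features.
print(names_where(features, dtype="VARCHAR"))
```

The resulting names are the same strings that are passed to `catalog.get_feature` in the next step.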
Get features from catalog¶
In [3]:
customer_age_band = catalog.get_feature("CUSTOMER_Age_band")
customer_latest_invoice_amount = catalog.get_feature("CUSTOMER_Latest_invoice_Amount")
customer_count_of_invoice_14d = catalog.get_feature("CUSTOMER_Count_of_invoice_14d")
customer_avg_of_invoice_amount_14d = catalog.get_feature("CUSTOMER_Avg_of_invoice_Amount_14d")
customer_std_of_invoice_amount_14d = catalog.get_feature("CUSTOMER_Std_of_invoice_Amount_14d")
customer_latest_invoice_amount_Z_score_to_invoice_amount_28d = catalog.get_feature(
    "CUSTOMER_Latest_invoice_Amount_Z_Score_to_invoice_Amount_28d"
)
customer_vs_overall_item_totalcost_across_product_productgroups_26w = catalog.get_feature(
    "CUSTOMER_vs_OVERALL_item_TotalCost_across_product_ProductGroups_26w"
)
Create feature list¶
In [4]:
simple_feature_list = fb.FeatureList(
    [
        customer_age_band,
        customer_latest_invoice_amount,
        customer_count_of_invoice_14d,
        customer_avg_of_invoice_amount_14d,
        customer_std_of_invoice_amount_14d,
        customer_latest_invoice_amount_Z_score_to_invoice_amount_28d,
        customer_vs_overall_item_totalcost_across_product_productgroups_26w,
    ],
    name="Customer Simple FeatureList",
)
Preview feature list¶
In [5]:
# Check the primary entity of the feature list
simple_feature_list.primary_entity
Out[5]:
[<featurebyte.api.entity.Entity at 0x7fa8b001cac0> { 'name': 'customer', 'created_at': '2023-09-11T13:56:58.863000', 'updated_at': '2023-09-11T13:57:16.943000', 'description': None, 'serving_names': [ 'GROCERYCUSTOMERGUID' ], 'catalog_name': 'Grocery Dataset Tutorial' }]
In [6]:
# Get observation table: 'Preview Table with 10 Customers'
preview_table = catalog.get_observation_table(
    "Preview Table with 10 Customers"
).to_pandas()
Downloading table |████████████████████████████████████████| 10/10 [100%] in 0.1
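The preview output further down shows two input columns alongside the features: `POINT_IN_TIME` and `GROCERYCUSTOMERGUID`, the latter being the serving name reported for the `customer` entity. As a minimal sketch, assuming those are the columns a preview table needs, the check below verifies them on plain dictionaries. `missing_columns` is a hypothetical helper, not part of the FeatureByte SDK, and the two sample rows are copied from the preview output.

```python
def missing_columns(rows, serving_names):
    """Hypothetical helper: required columns that are absent from at least one row."""
    required = ["POINT_IN_TIME", *serving_names]
    return [col for col in required if any(col not in row for row in rows)]


# Two rows copied from the preview output (input columns only).
sample_rows = [
    {
        "POINT_IN_TIME": "2022-07-21 12:50:21",
        "GROCERYCUSTOMERGUID": "dd0c74f0-9ba8-4ca9-bd84-8148095aa38a",
    },
    {
        "POINT_IN_TIME": "2022-11-10 15:57:20",
        "GROCERYCUSTOMERGUID": "0401635c-e6ab-4525-bb5d-00aba7f6d0c4",
    },
]

# An empty list means every assumed-required column is present in every row.
print(missing_columns(sample_rows, ["GROCERYCUSTOMERGUID"]))
```

Running a check like this before a preview makes a missing or misnamed column easier to spot than an error raised during feature computation.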
In [7]:
# Preview simple_feature_list
simple_feature_list.preview(preview_table)
Out[7]:
| | POINT_IN_TIME | GROCERYCUSTOMERGUID | CUSTOMER_Age_band | CUSTOMER_Latest_invoice_Amount | CUSTOMER_Count_of_invoice_14d | CUSTOMER_Avg_of_invoice_Amount_14d | CUSTOMER_Std_of_invoice_Amount_14d | CUSTOMER_Latest_invoice_Amount_Z_Score_to_invoice_Amount_28d | CUSTOMER_vs_OVERALL_item_TotalCost_across_product_ProductGroups_26w |
---|---|---|---|---|---|---|---|---|---|
0 | 2022-07-21 12:50:21 | dd0c74f0-9ba8-4ca9-bd84-8148095aa38a | 50-54 | 19.35 | 1 | 19.350000 | 0.000000 | 1.696160 | 0.681454 |
1 | 2022-11-10 15:57:20 | 0401635c-e6ab-4525-bb5d-00aba7f6d0c4 | 30-34 | 62.55 | 2 | 38.645000 | 23.905000 | 1.706156 | 0.751135 |
2 | 2022-08-14 19:00:14 | 54d86ef6-f9b8-40e2-9162-a60bd1b705db | 20-24 | 24.73 | 3 | 23.230000 | 8.492738 | 0.931491 | 0.417760 |
3 | 2022-07-12 08:02:04 | 4eb4ee84-ee13-4eec-9c26-61b6eb4ba35b | 65-69 | 46.00 | 4 | 25.005000 | 14.659950 | 1.567570 | 0.769076 |
4 | 2022-12-13 08:15:49 | 1e866814-e5a6-475d-87e3-b53377cc005b | 55-59 | 5.00 | 4 | 3.512500 | 1.064551 | -0.367178 | 0.532345 |
5 | 2023-04-26 16:52:34 | 48072b52-39cf-452c-8531-02cc4d0fc32e | 40-44 | 1.54 | 12 | 3.740000 | 5.472707 | -0.521815 | 0.839560 |
6 | 2023-03-01 11:31:00 | 081f111a-598b-43ae-a28a-3a5dc3d2a091 | 60-64 | 5.08 | 3 | 3.983333 | 1.759968 | -0.341443 | 0.704808 |
7 | 2023-01-19 16:33:33 | f3415165-754c-40b6-af17-06ef952a3fa1 | 20-24 | 2.28 | 15 | 10.937333 | 15.656393 | -0.585856 | 0.823894 |
8 | 2023-04-11 19:07:26 | d0ea14bf-038a-4ae5-887e-e2d4d68dd8f6 | 40-44 | 3.80 | 6 | 6.605000 | 6.384240 | -0.463131 | 0.866171 |
9 | 2023-04-10 08:24:27 | 69d8718e-8c4a-4264-8edf-e0ffc1ef4737 | 30-34 | 40.56 | 8 | 8.308750 | 12.339442 | 1.813072 | 0.856881 |
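To make the 14-day columns easier to read, the sketch below recomputes count, average, and standard deviation over a trailing window in plain Python. It is an illustration under stated assumptions, not how FeatureByte computes features: the window is taken as the 14 days strictly before the point in time, any scheduling details of the real computation are ignored, and a population standard deviation is used because row 0 reports a standard deviation of 0 for a single invoice. The invoices are hypothetical; only the point in time and the latest amount (62.55) come from row 1, while the 14.74 amount and all invoice timestamps were chosen so the aggregates match that row.

```python
from datetime import datetime, timedelta
from statistics import fmean, pstdev


def window_stats(invoices, point_in_time, days=14):
    """Count, mean and population std of amounts in the `days` before `point_in_time`."""
    start = point_in_time - timedelta(days=days)
    amounts = [amount for ts, amount in invoices if start <= ts < point_in_time]
    if not amounts:
        return {"count": 0, "avg": None, "std": None}
    return {"count": len(amounts), "avg": fmean(amounts), "std": pstdev(amounts)}


# Point in time taken from preview row 1.
point_in_time = datetime(2022, 11, 10, 15, 57, 20)

# Hypothetical invoices: timestamps and the 14.74 and 99.00 amounts are made up.
invoices = [
    (datetime(2022, 10, 1, 9, 0), 99.00),    # older than 14 days, so excluded
    (datetime(2022, 11, 2, 10, 0), 14.74),   # inferred amount, inside the window
    (datetime(2022, 11, 9, 18, 30), 62.55),  # latest invoice amount from row 1
]

stats = window_stats(invoices, point_in_time)
# Count is 2; avg and std come out close to 38.645 and 23.905, as in row 1.
print(stats)
```

With two invoices, the population standard deviation is half the distance between the two amounts, which is why a latest amount of 62.55 and an average of 38.645 are consistent with the reported 23.905.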
Save feature list¶
In [8]:
# Save feature list
simple_feature_list.save()
# Add description
simple_feature_list.update_description("Simple feature list for the customer")
Done! |████████████████████████████████████████| 100% in 6.8s (0.15%/s)
Loading Feature(s) |████████████████████████████████████████| 7/7 [100%] in 1.5s