Manage Feature Life Cycle¶
In this section, you'll learn how to update the readiness of features so that the system and its users know whether a feature is ready for production.
As data evolves, it's essential to:
- Update the table's default settings to reflect these changes.
- Create a new version of the feature that matches the updated settings.
- Create a new feature list version that uses the new default feature versions.
When the availability or freshness of a source table changes, you can create new feature versions with adjusted feature job settings. If data quality degrades, you can create a new feature version with cleaning operations that address the new quality issues.
Old feature versions must remain available so that ML tasks that depend on them continue to run without disruption.
In [1]:
import featurebyte as fb
import pandas as pd
# Set your profile to the tutorial environment
fb.use_profile("tutorial")
catalog_name = "Grocery Dataset Tutorial"
catalog = fb.Catalog.activate(catalog_name)
22:06:29 | INFO | Using configuration file at: /Users/gxav/.featurebyte/config.yaml
22:06:29 | INFO | Active profile: tutorial (https://tutorials.featurebyte.com/api/v1)
22:06:30 | WARNING | Remote SDK version (0.5.0.dev6) is different from local (0.5.0.dev1). Update local SDK to avoid unexpected behavior.
22:06:30 | INFO | No catalog activated.
22:06:30 | INFO | 6 feature lists, 31 features deployed
22:06:30 | INFO | Using profile: tutorial
22:06:30 | INFO | Using configuration file at: /Users/gxav/.featurebyte/config.yaml
22:06:30 | INFO | Active profile: tutorial (https://tutorials.featurebyte.com/api/v1)
22:06:31 | WARNING | Remote SDK version (0.5.0.dev6) is different from local (0.5.0.dev1). Update local SDK to avoid unexpected behavior.
22:06:31 | INFO | No catalog activated.
22:06:31 | INFO | 6 feature lists, 31 features deployed
22:06:32 | INFO | Catalog activated: Grocery Dataset Tutorial
Get feature from catalog¶
In [2]:
# Get CUSTOMER_Avg_of_invoice_Amount_28d
customer_avg_of_invoice_amount_28d = catalog.get_feature("CUSTOMER_Avg_of_invoice_Amount_28d")
Check feature definition file¶
In [3]:
customer_avg_of_invoice_amount_28d.definition
Out[3]:
# Generated by SDK version: 0.5.0.dev6
from bson import ObjectId
from featurebyte import ColumnCleaningOperation
from featurebyte import DisguisedValueImputation
from featurebyte import EventTable
from featurebyte import FeatureJobSetting
from featurebyte import ValueBeyondEndpointImputation

# event_table name: "GROCERYINVOICE"
event_table = EventTable.get_by_id(ObjectId("64ff1c910d5bfbfb21bce78a"))
event_view = event_table.get_view(
    view_mode="manual",
    drop_column_names=["record_available_at"],
    column_cleaning_operations=[
        ColumnCleaningOperation(
            column_name="Amount",
            cleaning_operations=[
                DisguisedValueImputation(
                    imputed_value=None, disguised_values=[-99, -98]
                ),
                ValueBeyondEndpointImputation(
                    type="less_than", end_point=0, imputed_value=0
                ),
                ValueBeyondEndpointImputation(
                    type="greater_than", end_point=2000, imputed_value=2000
                ),
            ],
        )
    ],
)
grouped = event_view.groupby(
    by_keys=["GroceryCustomerGuid"], category=None
).aggregate_over(
    value_column="Amount",
    method="avg",
    windows=["28d"],
    feature_names=["CUSTOMER_Avg_of_invoice_Amount_28d"],
    feature_job_setting=FeatureJobSetting(
        blind_spot="120s", frequency="3600s", time_modulo_frequency="120s"
    ),
    skip_fill_na=True,
)
feat = grouped["CUSTOMER_Avg_of_invoice_Amount_28d"]
output = feat
output.save(_id=ObjectId("64ff1d6a98b637caa7897492"))
Update Feature Readiness¶
In [4]:
# List features and readiness
display(catalog.list_features())
index | id | name | dtype | readiness | online_enabled | tables | primary_tables | entities | primary_entities | created_at |
---|---|---|---|---|---|---|---|---|---|---|
0 | 64ff1dc2183aab97b493af21 | CUSTOMER_vs_OVERALL_item_TotalCost_across_prod... | FLOAT | PRODUCTION_READY | False | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [customer] | [customer] | 2023-09-11T14:01:53.167000 |
1 | 64ff1d9a2fa89ef7c7f4f5b8 | CUSTOMER_Latest_invoice_Amount_Z_Score_to_invo... | FLOAT | PRODUCTION_READY | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:01:19.807000 |
2 | 64ff1d6a98b637caa7897494 | CUSTOMER_Std_of_invoice_Amount_28d | FLOAT | DRAFT | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:00:30.353000 |
3 | 64ff1d6a98b637caa7897493 | CUSTOMER_Std_of_invoice_Amount_14d | FLOAT | PRODUCTION_READY | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:00:29.117000 |
4 | 64ff1d6a98b637caa7897492 | CUSTOMER_Avg_of_invoice_Amount_28d | FLOAT | DRAFT | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:00:27.834000 |
5 | 64ff1d6a98b637caa7897491 | CUSTOMER_Avg_of_invoice_Amount_14d | FLOAT | PRODUCTION_READY | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:00:26.527000 |
6 | 64ff1d6a98b637caa789748f | CUSTOMER_Count_of_invoice_28d | FLOAT | DRAFT | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:00:25.244000 |
7 | 64ff1d6a98b637caa789748d | CUSTOMER_Count_of_invoice_14d | FLOAT | PRODUCTION_READY | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:00:24.165000 |
8 | 64ff1d6a98b637caa789748c | CUSTOMER_Latest_invoice_Amount | FLOAT | PRODUCTION_READY | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:00:23.056000 |
9 | 64ff1d4682feedb4dc913349 | CUSTOMER_Age_band | VARCHAR | PRODUCTION_READY | False | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [customer] | [customer] | 2023-09-11T13:59:53.623000 |
10 | 64ff1d4182feedb4dc91333f | CUSTOMER_Age | INT | DRAFT | False | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [customer] | [customer] | 2023-09-11T13:59:43.782000 |
In [5]:
# Update the readiness of the feature you want to share with others to Public Draft
customer_avg_of_invoice_amount_28d.update_readiness("PUBLIC_DRAFT")
In [6]:
# Check readiness
print(
    f" {customer_avg_of_invoice_amount_28d.name} readiness:",
    customer_avg_of_invoice_amount_28d.readiness,
)
CUSTOMER_Avg_of_invoice_Amount_28d readiness: PUBLIC_DRAFT
Collect additional information on a feature¶
In [7]:
# Get metadata on the feature
customer_avg_of_invoice_amount_28d.info()
Out[7]:
Feature
name | CUSTOMER_Avg_of_invoice_Amount_28d
created_at | 2023-09-11 14:00:27
updated_at | 2023-09-11 14:06:38
description | None
entities |
primary_entity |
tables |
default_version_mode | AUTO
version_count | 1
catalog_name | Grocery Dataset Tutorial
dtype | FLOAT
primary_table |
default_feature_id | 64ff1d6a98b637caa7897492
version |
readiness |
table_feature_job_setting |
table_cleaning_operation |
versions_info | None
metadata |
namespace_description | Avg of invoice Amount for the customer over a 28d period.
Update Default Feature Job Setting at the table level¶
In [8]:
# Get GROCERYINVOICE table
invoice_table = catalog.get_table("GROCERYINVOICE")
In [9]:
# Get current Default Feature Job Setting
invoice_table.default_feature_job_setting
Out[9]:
FeatureJobSetting(blind_spot='120s', frequency='3600s', time_modulo_frequency='120s')
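To make the three parameters concrete, the sketch below models them in plain Python. It assumes that jobs run whenever the epoch time in seconds equals `time_modulo_frequency` modulo `frequency`, and that each job ignores events newer than the job time minus `blind_spot`. This reading of the parameters and the helper names (`parse_seconds`, `latest_job_time`, `feature_cutoff`) are illustrative assumptions, not FeatureByte internals.

```python
from datetime import datetime, timedelta, timezone


def parse_seconds(value: str) -> int:
    """Convert a setting such as '3600s' into an integer number of seconds."""
    return int(value.rstrip("s"))


def latest_job_time(now: datetime, frequency: str, time_modulo_frequency: str) -> datetime:
    """Most recent scheduled job at or before `now` (assumed scheduling rule)."""
    period = parse_seconds(frequency)
    offset = parse_seconds(time_modulo_frequency)
    epoch = int(now.timestamp())
    job_epoch = ((epoch - offset) // period) * period + offset
    return datetime.fromtimestamp(job_epoch, tz=timezone.utc)


def feature_cutoff(job_time: datetime, blind_spot: str) -> datetime:
    """Latest event timestamp the job is assumed to include."""
    return job_time - timedelta(seconds=parse_seconds(blind_spot))


now = datetime(2023, 9, 11, 14, 30, 0, tzinfo=timezone.utc)
job = latest_job_time(now, "3600s", "120s")
print("job time:", job)                           # 14:02:00 UTC
print("cutoff:  ", feature_cutoff(job, "120s"))   # 14:00:00 UTC
```

Under these assumptions, a longer blind spot moves the cutoff further back from the job time, which gives late-arriving rows more time to land before they are needed.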
In [10]:
# List past analysis
past_analysis = invoice_table.list_feature_job_setting_analysis()
In [11]:
# Get past analysis
analysis_id = past_analysis.id.to_list()[0]
analysis = fb.FeatureJobSettingAnalysis.get_by_id(analysis_id)
In [12]:
# Backtest new setting
new_feature_job_setting = fb.FeatureJobSetting(
    blind_spot='240s',
    frequency=invoice_table.default_feature_job_setting.frequency,
    time_modulo_frequency=invoice_table.default_feature_job_setting.time_modulo_frequency
)
backtest_result = analysis.backtest(feature_job_setting=new_feature_job_setting)
Done! |████████████████████████████████████████| 100% in 6.7s (0.15%/s)
Feature Job Setting Analysis Report
Backtest Result
For the feature job setting:
- Frequency = 3600 s / Job time modulo frequency = 120 s / Blind spot = 240 s
The backtest found that all records would have been processed on time.
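One way to picture what "processed on time" means is the simplified model below. It assumes a record is handled by the first job whose cutoff (job time minus blind spot) is at or after the record's event time, and that the record counts as late if it only becomes available after that job has run. The timestamps, the function names, and this definition of lateness are assumptions made for illustration; the actual backtest logic may differ.

```python
def first_covering_job(event_ts: int, frequency: int, offset: int, blind_spot: int) -> int:
    """Earliest job time J with J % frequency == offset and J - blind_spot >= event_ts."""
    earliest = event_ts + blind_spot
    k = -(-(earliest - offset) // frequency)  # integer ceiling division
    return k * frequency + offset


def late_records(records, frequency: int, offset: int, blind_spot: int):
    """Return the (event_ts, available_ts) pairs that arrive after their job ran."""
    late = []
    for event_ts, available_ts in records:
        job = first_covering_job(event_ts, frequency, offset, blind_spot)
        if available_ts > job:
            late.append((event_ts, available_ts))
    return late


# Hypothetical records: (event time, time the row became available), in seconds.
records = [(3000, 3060), (3595, 3730)]
print(late_records(records, frequency=3600, offset=120, blind_spot=120))  # [(3595, 3730)]
print(late_records(records, frequency=3600, offset=120, blind_spot=240))  # []
```

In this toy data, the second record misses the job at 3720 s with a 120 s blind spot, but is picked up in time once the blind spot is widened to 240 s.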
In [13]:
# Update Default Feature Job Setting
invoice_table.update_default_feature_job_setting(new_feature_job_setting)
Note:
- Any new feature that uses the table picks up this new Default Feature Job Setting by default.
- In the Enterprise platform, a change to the Default Feature Job Setting goes through an approval process and automatically triggers a request to create new versions of existing features.
Update Default Cleaning Operations at the table level¶
In [14]:
# Get GROCERYINVOICE table
invoice_table = catalog.get_table("GROCERYINVOICE")
In [15]:
# Get Info on columns
columns_info = pd.DataFrame(invoice_table.info(verbose=True)['columns_info'])
display(columns_info)
index | name | dtype | entity | semantic | critical_data_info | description |
---|---|---|---|---|---|---|
0 | GroceryInvoiceGuid | VARCHAR | invoice | event_id | None | Unique identifier of each row in the table, in... |
1 | GroceryCustomerGuid | VARCHAR | customer | None | None | Unique identifier for each customer, in GUID f... |
2 | Timestamp | TIMESTAMP | None | event_timestamp | None | The GMT timestamp of when this invoice transac... |
3 | tz_offset | VARCHAR | None | time_zone | None | The local timezone offset of the invoice event. |
4 | record_available_at | TIMESTAMP | None | record_creation_timestamp | None | A timestamp for when this row was added to the... |
5 | Amount | FLOAT | None | None | {'cleaning_operations': [{'imputed_value': Non... | The total amount of the invoice, including all... |
In [16]:
# Get Current Cleaning Operation for Amount column
for info in columns_info.loc[columns_info.name=="Amount"]["critical_data_info"]:
    print(info)
{'cleaning_operations': [{'imputed_value': None, 'type': 'disguised', 'disguised_values': [-99, -98]}, {'imputed_value': 0, 'type': 'less_than', 'end_point': 0}, {'imputed_value': 2000, 'type': 'greater_than', 'end_point': 2000}]}
In [17]:
# Update Cleaning Operations by adding -96 as a new disguised missing value
new_cleaning_operations = [
    fb.DisguisedValueImputation(disguised_values=[-99, -98, -96], imputed_value=None),
    fb.ValueBeyondEndpointImputation(
        type="less_than", end_point=0, imputed_value=0
    ),
    fb.ValueBeyondEndpointImputation(
        type="greater_than", end_point=2000, imputed_value=2000
    ),
]
invoice_table["Amount"].update_critical_data_info(
    cleaning_operations=new_cleaning_operations
)
Note:
- Any new feature that uses the table column picks up these new Default Cleaning Operations by default.
- In the Enterprise platform, a change to the Default Cleaning Operations goes through an approval process and automatically triggers a request to create new versions of existing features that use the table column.
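The effect of adding -96 to the disguised values can be sketched in plain Python. The snippet assumes the operations are applied in the order they are listed and that `imputed_value=None` marks a value as missing; `clean_amount` and the sample amounts are hypothetical and only mimic the behaviour the operation names suggest.

```python
def clean_amount(value, disguised_values, lower=0, upper=2000):
    """Apply the three cleaning steps in order; None stands for a missing value."""
    if value is None or value in disguised_values:
        return None   # disguised values are treated as missing
    if value < lower:
        return lower  # values below the lower end point are raised to it
    if value > upper:
        return upper  # values above the upper end point are capped
    return value


raw = [12.5, -99, -3.0, 2500.0, -96, 40.0]
old = [clean_amount(v, {-99, -98}) for v in raw]
new = [clean_amount(v, {-99, -98, -96}) for v in raw]
print(old)  # [12.5, None, 0, 2000, 0, 40.0]
print(new)  # [12.5, None, 0, 2000, None, 40.0]
```

Under the old operations, -96 falls through to the `less_than` rule and is counted as an amount of 0, which would pull an average down; under the new operations it is treated as missing instead.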
Change feature job setting and cleaning operations of a feature¶
In [18]:
# Get feature CUSTOMER_Avg_of_invoice_Amount_14d
customer_avg_of_invoice_amount_14d = catalog.get_feature("CUSTOMER_Avg_of_invoice_Amount_14d")
In [19]:
# Get current feature job setting
customer_avg_of_invoice_amount_14d.info()["table_feature_job_setting"]
Out[19]:
{'this': [{'table_name': 'GROCERYINVOICE', 'feature_job_setting': {'blind_spot': '120s', 'frequency': '3600s', 'time_modulo_frequency': '120s'}}], 'default': [{'table_name': 'GROCERYINVOICE', 'feature_job_setting': {'blind_spot': '120s', 'frequency': '3600s', 'time_modulo_frequency': '120s'}}]}
In [20]:
# Get current cleaning operations
customer_avg_of_invoice_amount_14d.info()["table_cleaning_operation"]
Out[20]:
{'this': [{'table_name': 'GROCERYINVOICE', 'column_cleaning_operations': [{'column_name': 'Amount', 'cleaning_operations': [{'imputed_value': None, 'type': 'disguised', 'disguised_values': [-99, -98]}, {'imputed_value': 0, 'type': 'less_than', 'end_point': 0}, {'imputed_value': 2000, 'type': 'greater_than', 'end_point': 2000}]}]}], 'default': [{'table_name': 'GROCERYINVOICE', 'column_cleaning_operations': [{'column_name': 'Amount', 'cleaning_operations': [{'imputed_value': None, 'type': 'disguised', 'disguised_values': [-99, -98]}, {'imputed_value': 0, 'type': 'less_than', 'end_point': 0}, {'imputed_value': 2000, 'type': 'greater_than', 'end_point': 2000}]}]}]}
In [21]:
# Deprecate current default version
customer_avg_of_invoice_amount_14d.update_readiness("DEPRECATED")
In [22]:
# Create new version
new_version = customer_avg_of_invoice_amount_14d.create_new_version(
    table_feature_job_settings=[
        fb.TableFeatureJobSetting(
            table_name="GROCERYINVOICE",
            feature_job_setting=new_feature_job_setting
        )
    ],
    table_cleaning_operations=[
        fb.TableCleaningOperation(
            table_name="GROCERYINVOICE",
            column_cleaning_operations=[
                fb.ColumnCleaningOperation(
                    column_name="Amount",
                    cleaning_operations=new_cleaning_operations
                )
            ]
        )
    ]
)
In [23]:
# Check new version is the default
print(
    f"version_name: {new_version.version} \n",
    f"is the new version the new default? {new_version.is_default}"
)
version_name: V230911_1
 is the new version the new default? True
In [24]:
# Check you get the new version from the catalog by default
customer_avg_of_invoice_amount_14d = catalog.get_feature("CUSTOMER_Avg_of_invoice_Amount_14d")
print(customer_avg_of_invoice_amount_14d.version)
V230911_1
In [25]:
# List versions
customer_avg_of_invoice_amount_14d.list_versions()
Out[25]:
index | id | name | version | dtype | readiness | online_enabled | tables | primary_tables | entities | primary_entities | created_at | is_default |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 64ff1f020290c0ac6f3178d6 | CUSTOMER_Avg_of_invoice_Amount_14d | V230911_1 | FLOAT | DRAFT | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:06:58.523000 | True |
1 | 64ff1d6a98b637caa7897491 | CUSTOMER_Avg_of_invoice_Amount_14d | V230911 | FLOAT | DEPRECATED | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:00:26.509000 | False |
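The listing is consistent with a simple selection rule for the default version: prefer the highest readiness, and break ties by the most recent creation time. The sketch below encodes that rule as an assumption about the AUTO default version mode; `READINESS_RANK` and `pick_default` are hypothetical names, and the readiness ordering is assumed rather than taken from the SDK.

```python
# Assumed ordering from least to most ready.
READINESS_RANK = {"DEPRECATED": 0, "DRAFT": 1, "PUBLIC_DRAFT": 2, "PRODUCTION_READY": 3}


def pick_default(versions):
    """Pick the version with the highest readiness, then the latest created_at."""
    return max(versions, key=lambda v: (READINESS_RANK[v["readiness"]], v["created_at"]))


# The two versions shown in the listing above.
versions = [
    {"version": "V230911_1", "readiness": "DRAFT", "created_at": "2023-09-11T14:06:58.523000"},
    {"version": "V230911", "readiness": "DEPRECATED", "created_at": "2023-09-11T14:00:26.509000"},
]
print(pick_default(versions)["version"])  # V230911_1
```

Deprecating the old version first is what lets the new DRAFT version win under this rule; had the old version stayed PRODUCTION_READY, it would have outranked the new one.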
In [26]:
# Upgrade readiness of the new version
customer_avg_of_invoice_amount_14d.update_readiness("PRODUCTION_READY")
Upgrade readiness of a feature with discrepancies with table defaults¶
In [27]:
# Get the feature CUSTOMER_Std_of_invoice_Amount_28d
customer_std_of_invoice_amount_28d = catalog.get_feature("CUSTOMER_Std_of_invoice_Amount_28d")
In [28]:
# Check that guardrails prevent you from upgrading the feature to production ready
# because of discrepancies between the feature settings and the default settings.
try:
    customer_std_of_invoice_amount_28d.update_readiness("PRODUCTION_READY")
except Exception as e:
    print(e)
Discrepancies found between the promoted feature version you are trying to promote to PRODUCTION_READY, and the input table.
{'feature_job_setting': {'data_source': FeatureJobSetting(blind_spot='240s', frequency='3600s', time_modulo_frequency='120s'), 'promoted_feature': FeatureJobSetting(blind_spot='120s', frequency='3600s', time_modulo_frequency='120s')}, 'cleaning_operations': {'data_source': [ColumnCleaningOperation(column_name='Amount', cleaning_operations=[DisguisedValueImputation(imputed_value=None, disguised_values=[-99, -98, -96]), ValueBeyondEndpointImputation(imputed_value=0, type=less_than, end_point=0), ValueBeyondEndpointImputation(imputed_value=2000, type=greater_than, end_point=2000)])], 'promoted_feature': [ColumnCleaningOperation(column_name='Amount', cleaning_operations=[DisguisedValueImputation(imputed_value=None, disguised_values=[-99, -98]), ValueBeyondEndpointImputation(imputed_value=0, type=less_than, end_point=0), ValueBeyondEndpointImputation(imputed_value=2000, type=greater_than, end_point=2000)])]}}
Please fix these issues first before trying to promote your feature to PRODUCTION_READY.
In [29]:
# Ignore the guardrails if you are ok with the feature setting
customer_std_of_invoice_amount_28d.update_readiness("PRODUCTION_READY", ignore_guardrails=True)
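The guardrail can be pictured as a comparison between the feature's own settings and the table defaults, with an explicit override. The sketch below is a plain-Python model of that idea, not the SDK's implementation: `find_discrepancies` and `promote` are hypothetical helpers, and the settings are reduced to two simplified keys taken from the error message above.

```python
def find_discrepancies(feature_settings: dict, table_defaults: dict) -> dict:
    """Report every setting where the feature differs from the table default."""
    return {
        key: {
            "data_source": table_defaults[key],
            "promoted_feature": feature_settings.get(key),
        }
        for key in table_defaults
        if feature_settings.get(key) != table_defaults[key]
    }


def promote(feature_settings: dict, table_defaults: dict, ignore_guardrails: bool = False) -> str:
    """Return the new readiness, or raise if discrepancies exist and are not waived."""
    discrepancies = find_discrepancies(feature_settings, table_defaults)
    if discrepancies and not ignore_guardrails:
        raise ValueError(f"Discrepancies found: {discrepancies}")
    return "PRODUCTION_READY"


table_defaults = {"blind_spot": "240s", "disguised_values": [-99, -98, -96]}
feature_settings = {"blind_spot": "120s", "disguised_values": [-99, -98]}

try:
    promote(feature_settings, table_defaults)
except ValueError as error:
    print(error)

print(promote(feature_settings, table_defaults, ignore_guardrails=True))
```

Making the override an explicit argument keeps the default path safe while still allowing a deliberate exception when the older settings are acceptable for a given feature.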
Create new version of a feature list¶
In [30]:
# Get feature list from catalog
simple_feature_list = catalog.get_feature_list("Customer Simple FeatureList")
Loading Feature(s) |████████████████████████████████████████| 7/7 [100%] in 1.0s
In [31]:
# Check Fraction of default features
simple_feature_list.default_feature_fraction
Out[31]:
0.8571428571428571
In [32]:
# Create new version
new_version = simple_feature_list.create_new_version()
Loading Feature(s) |████████████████████████████████████████| 7/7 [100%] in 0.9s
In [33]:
# Check Fraction of default features of new version
new_version.default_feature_fraction
Out[33]:
1.0
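The two values are consistent with a simple ratio over the 7 features in the list: 0.8571428571428571 is 6/7, which suggests that one feature in the old list version was no longer the default version (presumably the feature that had just received a new version), while all 7 are default in the new list version. The sketch below reproduces that arithmetic; the `fraction` helper and the boolean flags are illustrative assumptions rather than SDK code.

```python
def fraction(flags) -> float:
    """Share of features for which the flag is True."""
    flags = list(flags)
    return sum(flags) / len(flags)


# Hypothetical is_default flags for the 7 features in each list version.
old_list_is_default = [True] * 6 + [False]
new_list_is_default = [True] * 7

print(fraction(old_list_is_default))  # 0.8571428571428571
print(fraction(new_list_is_default))  # 1.0
```

The same ratio, computed over a "is production ready" flag instead of "is default", would give a production-ready fraction for the list.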
In [34]:
# Check new version is the default
print(
    f"version_name: {new_version.version} \n",
    f"is the new version the new default? {new_version.is_default}"
)
version_name: V230911_1
 is the new version the new default? True
In [35]:
# Get Default version from Catalog
simple_feature_list = catalog.get_feature_list("Customer Simple FeatureList")
print(simple_feature_list.version)
Loading Feature(s) |████████████████████████████████████████| 7/7 [100%] in 1.0s
V230911_1
In [36]:
# List versions
simple_feature_list.list_versions()
Out[36]:
index | id | name | version | online_frac | deployed | created_at | is_default |
---|---|---|---|---|---|---|---|
0 | 64ff1f0fc0038ba1e42526d6 | Customer Simple FeatureList | V230911_1 | 0.0 | False | 2023-09-11T14:07:12.319000 | True |
1 | 64ff1dec72f1e0466e55f6a3 | Customer Simple FeatureList | V230911 | 0.0 | False | 2023-09-11T14:02:34.988000 | False |
In [37]:
# Check Production Readiness
simple_feature_list.production_ready_fraction
Out[37]:
1.0
Delete Draft features and feature lists¶
In [38]:
# Get list of features together with their readiness
catalog.list_features()
Out[38]:
index | id | name | dtype | readiness | online_enabled | tables | primary_tables | entities | primary_entities | created_at |
---|---|---|---|---|---|---|---|---|---|---|
0 | 64ff1dc2183aab97b493af21 | CUSTOMER_vs_OVERALL_item_TotalCost_across_prod... | FLOAT | PRODUCTION_READY | False | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [customer] | [customer] | 2023-09-11T14:01:53.167000 |
1 | 64ff1d9a2fa89ef7c7f4f5b8 | CUSTOMER_Latest_invoice_Amount_Z_Score_to_invo... | FLOAT | PRODUCTION_READY | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:01:19.807000 |
2 | 64ff1d6a98b637caa7897494 | CUSTOMER_Std_of_invoice_Amount_28d | FLOAT | PRODUCTION_READY | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:00:30.353000 |
3 | 64ff1d6a98b637caa7897493 | CUSTOMER_Std_of_invoice_Amount_14d | FLOAT | PRODUCTION_READY | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:00:29.117000 |
4 | 64ff1d6a98b637caa7897492 | CUSTOMER_Avg_of_invoice_Amount_28d | FLOAT | PUBLIC_DRAFT | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:00:27.834000 |
5 | 64ff1f020290c0ac6f3178d6 | CUSTOMER_Avg_of_invoice_Amount_14d | FLOAT | PRODUCTION_READY | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:00:26.527000 |
6 | 64ff1d6a98b637caa789748f | CUSTOMER_Count_of_invoice_28d | FLOAT | DRAFT | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:00:25.244000 |
7 | 64ff1d6a98b637caa789748d | CUSTOMER_Count_of_invoice_14d | FLOAT | PRODUCTION_READY | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:00:24.165000 |
8 | 64ff1d6a98b637caa789748c | CUSTOMER_Latest_invoice_Amount | FLOAT | PRODUCTION_READY | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:00:23.056000 |
9 | 64ff1d4682feedb4dc913349 | CUSTOMER_Age_band | VARCHAR | PRODUCTION_READY | False | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [customer] | [customer] | 2023-09-11T13:59:53.623000 |
10 | 64ff1d4182feedb4dc91333f | CUSTOMER_Age | INT | DRAFT | False | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [customer] | [customer] | 2023-09-11T13:59:43.782000 |
In [39]:
# Get draft features to delete
customer_count_of_invoice_28d = catalog.get_feature("CUSTOMER_Count_of_invoice_28d")
customer_age = catalog.get_feature("CUSTOMER_Age")
In [40]:
# Create a feature list that uses the 2 features we want to delete
new_feature_list = fb.FeatureList(
    [
        customer_count_of_invoice_28d,
        customer_age
    ],
    name="New List"
)
new_feature_list.save()
Done! |████████████████████████████████████████| 100% in 3.5s (0.29%/s)
Loading Feature(s) |████████████████████████████████████████| 2/2 [100%] in 0.8s
In [41]:
# Check that guardrails prevent you from deleting a feature that is still in use by a feature list
try:
    customer_age.delete()
except Exception as e:
    print(e)
("Feature is still in use by feature list(s). Please remove the following feature list(s) first:\n[{'id': '64ff1f1ddf3fe490245d9dc3', 'name': 'New List', 'version': 'V230911'}]", 'Failed to delete specified object.')
In [42]:
# Delete feature list first
new_feature_list.delete()
In [43]:
# Then delete the 2 draft features
customer_age.delete()
customer_count_of_invoice_28d.delete()
In [44]:
# Check that guardrails prevent you from deleting a feature that is not a draft
customer_avg_of_invoice_amount_28d = catalog.get_feature("CUSTOMER_Avg_of_invoice_Amount_28d")
try:
    customer_avg_of_invoice_amount_28d.delete()
except Exception as e:
    print(e)
('Only feature with draft readiness can be deleted.', 'Failed to delete specified object.')
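Taken together, the two error messages in this section describe two deletion rules: a feature must not be used by any feature list, and it must still have draft readiness. The sketch below models both rules in plain Python. `check_deletable` is a hypothetical helper, the dictionaries stand in for catalog objects, and the order in which the two checks run is an assumption.

```python
def check_deletable(feature: dict, feature_lists: list) -> bool:
    """Raise ValueError if the feature may not be deleted; return True otherwise."""
    in_use = [fl["name"] for fl in feature_lists if feature["name"] in fl["features"]]
    if in_use:
        raise ValueError(f"Feature is still in use by feature list(s): {in_use}")
    if feature["readiness"] != "DRAFT":
        raise ValueError("Only feature with draft readiness can be deleted.")
    return True


customer_age = {"name": "CUSTOMER_Age", "readiness": "DRAFT"}
avg_28d = {"name": "CUSTOMER_Avg_of_invoice_Amount_28d", "readiness": "PUBLIC_DRAFT"}
new_list = {"name": "New List", "features": ["CUSTOMER_Count_of_invoice_28d", "CUSTOMER_Age"]}

cases = [(customer_age, [new_list]), (customer_age, []), (avg_28d, [])]
for feature, lists in cases:
    try:
        print(feature["name"], "deletable:", check_deletable(feature, lists))
    except ValueError as error:
        print(feature["name"], "blocked:", error)
```

A feature that has moved beyond draft readiness cannot be removed under these rules, so deprecating it is the way to retire it.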
In [45]:
# Instead deprecate it
customer_avg_of_invoice_amount_28d.update_readiness("DEPRECATED")
In [46]:
# Get updated list of features together with their readiness
catalog.list_features()
Out[46]:
index | id | name | dtype | readiness | online_enabled | tables | primary_tables | entities | primary_entities | created_at |
---|---|---|---|---|---|---|---|---|---|---|
0 | 64ff1dc2183aab97b493af21 | CUSTOMER_vs_OVERALL_item_TotalCost_across_prod... | FLOAT | PRODUCTION_READY | False | [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] | [INVOICEITEMS] | [customer] | [customer] | 2023-09-11T14:01:53.167000 |
1 | 64ff1d9a2fa89ef7c7f4f5b8 | CUSTOMER_Latest_invoice_Amount_Z_Score_to_invo... | FLOAT | PRODUCTION_READY | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:01:19.807000 |
2 | 64ff1d6a98b637caa7897494 | CUSTOMER_Std_of_invoice_Amount_28d | FLOAT | PRODUCTION_READY | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:00:30.353000 |
3 | 64ff1d6a98b637caa7897493 | CUSTOMER_Std_of_invoice_Amount_14d | FLOAT | PRODUCTION_READY | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:00:29.117000 |
4 | 64ff1d6a98b637caa7897492 | CUSTOMER_Avg_of_invoice_Amount_28d | FLOAT | DEPRECATED | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:00:27.834000 |
5 | 64ff1f020290c0ac6f3178d6 | CUSTOMER_Avg_of_invoice_Amount_14d | FLOAT | PRODUCTION_READY | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:00:26.527000 |
6 | 64ff1d6a98b637caa789748d | CUSTOMER_Count_of_invoice_14d | FLOAT | PRODUCTION_READY | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:00:24.165000 |
7 | 64ff1d6a98b637caa789748c | CUSTOMER_Latest_invoice_Amount | FLOAT | PRODUCTION_READY | False | [GROCERYINVOICE] | [GROCERYINVOICE] | [customer] | [customer] | 2023-09-11T14:00:23.056000 |
8 | 64ff1d4682feedb4dc913349 | CUSTOMER_Age_band | VARCHAR | PRODUCTION_READY | False | [GROCERYCUSTOMER] | [GROCERYCUSTOMER] | [customer] | [customer] | 2023-09-11T13:59:53.623000 |
Concepts in this tutorial¶
- Feature versioning
- Default feature version
- Feature list versioning
- Default Feature Job Setting
- Feature Job Setting Recommendation
- Default Cleaning Operations
SDK reference for¶
- Feature
- FeatureList
- Feature.readiness
- Feature.update_readiness()
- Feature.info()
- EventTable.list_feature_job_setting_analysis()
- EventTable.create_new_feature_job_setting_analysis()
- EventTable.update_default_feature_job_setting()
- FeatureJobSettingAnalysis.get_by_id()
- FeatureJobSettingAnalysis.backtest()
- TableColumn.update_critical_data_info()
- Feature.create_new_version()
- Feature.list_versions()
- Feature.delete()
- FeatureList.create_new_version()
- FeatureList.list_versions()
- FeatureList.delete()
- FeatureList.production_ready_fraction
- FeatureList.default_feature_fraction