SDK Overview
The entire process, from creating features to serving them, can be performed with FeatureByte's Python SDK, available as a free, source-available package. It gives you a low-code way to create, experiment with, serve, and manage your features in one place.
Create with Ease:
- Create and share state-of-the-art ML features effortlessly - let FeatureByte handle the complexity of time-aware SQL.
- Reuse and tailor features for your specific use cases.
- Bring your own user-defined functions (UDFs) to use transformer models within FeatureByte.
Create and Save Feature
# Get view from catalog
invoice_view = catalog.get_view("GROCERYINVOICE")
# Declare features of total spent by customer
# in the past 7 and 28 days
customer_purchases = invoice_view.groupby(
    "GroceryCustomerGuid"
).aggregate_over(
    "Amount",
    method="sum",
    feature_names=[
        "CustomerTotalSpent_7d",
        "CustomerTotalSpent_28d",
    ],
    windows=["7d", "28d"],
)
customer_purchases.save()
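To make the windowed aggregation above concrete, here is a minimal plain-Python sketch of what a trailing 7-day and 28-day sum per customer means at a single point in time. It does not use FeatureByte and is not how FeatureByte computes features. The sample invoices, the `total_spent` helper, and the boundary handling (window start included, point in time excluded) are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical invoices: (customer id, invoice timestamp, amount).
invoices = [
    ("c1", datetime(2022, 1, 1), 10.0),
    ("c1", datetime(2022, 1, 20), 5.0),
    ("c1", datetime(2022, 1, 27), 2.5),
    ("c2", datetime(2022, 1, 26), 8.0),
]


def total_spent(rows, customer, point_in_time, window_days):
    """Sum one customer's amounts in the window ending at point_in_time."""
    start = point_in_time - timedelta(days=window_days)
    return sum(
        amount
        for cust, ts, amount in rows
        # Assumed boundaries: include the window start, exclude the point in time.
        if cust == customer and start <= ts < point_in_time
    )


as_of = datetime(2022, 1, 28)
spent_7d = total_spent(invoices, "c1", as_of, 7)    # only the Jan 27 invoice
spent_28d = total_spent(invoices, "c1", as_of, 28)  # all three c1 invoices
print(spent_7d, spent_28d)
```

The same aggregation yields two features because only the window length changes. This is the role the `windows` and `feature_names` arguments play in the declaration above.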
Experiment with a Feature List
# Get feature list from the catalog
feature_list = catalog.get_feature_list(
    "200 Features on Active Customers"
)
# Get an observation set from the catalog
observation_set = catalog.get_observation_table(
    "5M rows of active Customers in 2021-2022"
)
# Compute training data and
# store it in the feature store for reuse and audit
training = feature_list.compute_historical_feature_table(
    observation_set,
    name="Training set to predict purchases next 2w",
)
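As a rough illustration of what computing historical features for an observation set involves, the plain-Python sketch below evaluates one feature separately for each observation row, using only data from before that row's point in time. It is a conceptual stand-in, not FeatureByte's implementation. The sample data, the `customer` key, and the `spent_7d` helper are hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical invoices: (customer id, timestamp, amount).
invoices = [
    ("c1", datetime(2021, 3, 1), 20.0),
    ("c1", datetime(2021, 3, 5), 4.0),
    ("c1", datetime(2021, 3, 9), 6.0),
]

# Hypothetical observation set: one row per (entity, point in time).
observation_set = [
    {"customer": "c1", "POINT_IN_TIME": datetime(2021, 3, 6)},
    {"customer": "c1", "POINT_IN_TIME": datetime(2021, 3, 10)},
]


def spent_7d(customer, point_in_time):
    """Sum amounts in the 7 days before point_in_time; later rows are ignored."""
    start = point_in_time - timedelta(days=7)
    return sum(
        amount
        for cust, ts, amount in invoices
        if cust == customer and start <= ts < point_in_time
    )


# Attach the feature value as of each row's own point in time.
training_rows = [
    {**row, "CustomerTotalSpent_7d": spent_7d(row["customer"], row["POINT_IN_TIME"])}
    for row in observation_set
]
for row in training_rows:
    print(row["POINT_IN_TIME"].date(), row["CustomerTotalSpent_7d"])
```

The same customer gets a different value in each row, and the March 9 invoice never contributes to the March 6 observation. Keeping later data out of earlier rows in this way is what prevents leakage into the training data.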
Experiment Boldly:
- Instant access to historical features.
- Innovate faster with live data experimentation at scale.
Serve Swiftly:
- Deploy AI data pipelines and serve features with minimal latency.
- Maintain data consistency between training and inference.
Deploy and Serve Feature List
# Get feature list from the catalog
feature_list = catalog.get_feature_list(
    "200 Features on Active Customers"
)
# Create deployment
deployment = feature_list.deploy(
    name="Features for customer purchases next 2w",
)
# Activate deployment
deployment.enable()
# Get shell script template for online serving
deployment.get_online_serving_code(language="sh")
Define Data Cleaning Policy on Table
import featurebyte as fb
# Get table from catalog
items_table = catalog.get_table("INVOICEITEMS")
# Discount must not be negative
items_table.Discount.update_critical_data_info(
    cleaning_operations=[
        # Replace missing discounts with 0
        fb.MissingValueImputation(
            imputed_value=0
        ),
        # Replace discounts below 0 with 0
        fb.ValueBeyondEndpointImputation(
            type="less_than",
            end_point=0,
            imputed_value=0,
        ),
    ]
)
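The effect of the two cleaning operations above on individual values can be sketched in plain Python. This is an illustrative stand-in under simple assumptions, not FeatureByte's implementation. The `clean_discount` helper and the sample values are hypothetical.

```python
import math


def clean_discount(value):
    """Apply the two cleaning steps in order: missing -> 0, then below 0 -> 0."""
    # Step 1: impute missing values (None or NaN) with 0.
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return 0
    # Step 2: replace values less than the end point 0 with 0.
    if value < 0:
        return 0
    return value


raw = [5.0, None, -2.5, float("nan"), 0.0]
cleaned = [clean_discount(v) for v in raw]
print(cleaned)  # [5.0, 0, 0, 0, 0.0]
```

Valid discounts pass through unchanged, so only missing and negative entries are rewritten.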
Manage Effectively:
- Centralize and streamline your feature engineering processes.
- Monitor and maintain the health of your feature pipelines.
Learn by Example
Explore FeatureByte's SDK with our step-by-step tutorials. They guide you through creating a catalog, registering its data model, formulating your use case, crafting features, computing training data, and deploying and managing those features.