SDK Overview

The entire process, from creating features to serving them, can be performed with FeatureByte's Python SDK, which is available as a free, source-available package. It is your gateway to a low-code experience for creating, experimenting with, serving, and managing your features under one roof.

Create with Ease:
  • Create and share state-of-the-art ML features effortlessly - let FeatureByte handle the complexity of time-aware SQL.
  • Reuse and tailor features for your specific use cases.
  • Bring your own UDFs to leverage the power of transformer models within FeatureByte.
Create and Save Feature
# Get view from the active catalog
invoice_view = catalog.get_view("GROCERYINVOICE")
# Declare features for the total amount spent by each customer
# in the past 7 and 28 days
customer_purchases = invoice_view.groupby(
    "GroceryCustomerGuid"
).aggregate_over(
    "Amount",
    method="sum",
    feature_names=[
        "CustomerTotalSpent_7d",
        "CustomerTotalSpent_28d"
    ],
    windows=["7d", "28d"],
)
customer_purchases.save()
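To make the idea of a time-aware window aggregation concrete, here is a minimal plain-Python sketch of what "total spent in the past 7 and 28 days" means at a given point in time. It does not use FeatureByte. The `INVOICES` data, the `total_spent` helper, and the half-open window boundary are assumptions made for illustration, and the SDK's actual window handling may differ.

```python
from datetime import datetime, timedelta

# Hypothetical in-memory stand-in for an invoice view:
# (customer id, invoice timestamp, amount)
INVOICES = [
    ("c1", datetime(2022, 3, 1), 10.0),
    ("c1", datetime(2022, 3, 20), 5.0),
    ("c1", datetime(2022, 3, 27), 2.5),
    ("c2", datetime(2022, 3, 26), 7.0),
]


def total_spent(invoices, customer, point_in_time, days):
    """Sum one customer's amounts in [point_in_time - days, point_in_time)."""
    start = point_in_time - timedelta(days=days)
    return sum(
        amount
        for cust, ts, amount in invoices
        if cust == customer and start <= ts < point_in_time
    )


point_in_time = datetime(2022, 3, 28)
spent_7d = total_spent(INVOICES, "c1", point_in_time, 7)
spent_28d = total_spent(INVOICES, "c1", point_in_time, 28)
print(spent_7d, spent_28d)
```

The same customer gets a different value for each window because only invoices that fall inside the window, and strictly before the point in time, are counted.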
Experiment Boldly:
  • Get instant access to historical features.
  • Innovate faster with live data experimentation at scale.
Experiment with a Feature List
# Get feature list from the catalog
feature_list = catalog.get_feature_list(
    "200 Features on Active Customers"
)
# Get an observation set from the catalog
observation_set = catalog.get_observation_table(
    "5M rows of active Customers in 2021-2022"
)
# Compute training data and
# store it in the feature store for reuse and audit
training = feature_list.compute_historical_feature_table(
    observation_set,
    name="Training set to predict purchases next 2w",
)
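As a rough illustration of what computing historical features for an observation set involves, the sketch below builds one training row per (customer, point in time) pair, using only events that happened before that point in time. It is plain Python, not the FeatureByte implementation. `EVENTS`, `OBSERVATIONS`, and `materialize` are hypothetical names introduced for this example.

```python
from datetime import datetime, timedelta

# Hypothetical event data: (customer id, timestamp, amount)
EVENTS = [
    ("c1", datetime(2021, 6, 1), 20.0),
    ("c1", datetime(2021, 6, 10), 30.0),
    ("c1", datetime(2021, 7, 1), 50.0),
]

# Hypothetical observation set: one row per (entity, point in time)
OBSERVATIONS = [
    ("c1", datetime(2021, 6, 5)),
    ("c1", datetime(2021, 6, 15)),
]


def materialize(observations, events, window_days=28):
    """Return one training row per observation, using only past events."""
    rows = []
    for customer, point_in_time in observations:
        start = point_in_time - timedelta(days=window_days)
        total = sum(
            amount
            for cust, ts, amount in events
            if cust == customer and start <= ts < point_in_time
        )
        rows.append(
            {
                "customer": customer,
                "point_in_time": point_in_time,
                f"total_{window_days}d": total,
            }
        )
    return rows


training_rows = materialize(OBSERVATIONS, EVENTS)
for row in training_rows:
    print(row)
```

The event dated 2021-07-01 never contributes to either row, because it lies after both points in time. Excluding future data this way is what keeps training values consistent with what would have been available at serving time.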
Serve Swiftly:
  • Deploy AI data pipelines and serve features with minimal latency.
  • Maintain data consistency between training and inference.
Deploy and Serve Feature List
# Get feature list from the catalog
feature_list = catalog.get_feature_list(
    "200 Features on Active Customers"
)
# Create deployment
deployment = feature_list.deploy(
    name="Features for customer purchases next 2w",
)
# Activate deployment
deployment.enable()
# Get shell script template for online serving
deployment.get_online_serving_code(language="sh")
Manage Effectively:
  • Centralize and streamline your feature engineering processes.
  • Monitor and maintain the health of your feature pipelines.
Define Data Cleaning Policy on Table
import featurebyte as fb

# Get table from catalog
items_table = catalog.get_table("INVOICEITEMS")

# Discount must not be negative:
# impute missing values with 0 and replace values below 0 with 0
items_table.Discount.update_critical_data_info(
    cleaning_operations=[
        fb.MissingValueImputation(
            imputed_value=0
        ),
        fb.ValueBeyondEndpointImputation(
            type="less_than",
            end_point=0,
            imputed_value=0
        ),
    ]
)
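The intended effect of the Discount cleaning policy can be pictured with a small plain-Python sketch. This is an illustration of the two rules applied in order, not FeatureByte's implementation. The `clean_discount` helper is hypothetical, and it assumes a missing value is represented as `None`.

```python
def clean_discount(value):
    """Apply the two cleaning rules, in order, to a single Discount value."""
    # Rule 1: a missing value is imputed with 0
    if value is None:
        return 0
    # Rule 2: a value below the end point 0 is replaced with 0
    if value < 0:
        return 0
    # Anything else passes through unchanged
    return value


raw = [None, -1.5, 0, 2.0]
cleaned = [clean_discount(v) for v in raw]
print(cleaned)  # [0, 0, 0, 2.0]
```

Because the comparison is strictly "less than", a discount of exactly 0 passes through unchanged.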

Learn by Example

Explore FeatureByte's SDK through our step-by-step tutorials. They guide you through creating a catalog, registering its data model, formulating your use case, crafting features, computing training data, and deploying and managing those features.