FeatureGroup

A FeatureGroup is a transient object for handling a collection of Feature objects. The FeatureGroup itself cannot be saved or added to a catalog, although the features it contains can be saved.

Creating a FeatureGroup

A FeatureGroup is returned by the aggregate_over() method used to define Aggregate Over a Window and Cross Aggregate Over a Window features. In the following example, the FeatureGroup consists of two Feature objects: CustomerDiscounts_7d and CustomerDiscounts_28d.

# Group items by the column GroceryCustomerGuid that references the customer entity
items_by_customer = items_view.groupby("GroceryCustomerGuid")
# Declare features that measure the discount received by customer
customer_discounts = items_by_customer.aggregate_over(
    "Discount",
    method=fb.AggFunc.SUM,
    feature_names=["CustomerDiscounts_7d", "CustomerDiscounts_28d"],
    fill_value=0,
    windows=['7d', '28d']
)
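
To make the meaning of the two windows concrete, the following plain-pandas sketch computes the same kind of sums by hand. It is only an illustration: the data, the Timestamp column, the customer keys, and the windowed_sum helper are hypothetical, the window boundaries are an assumption, and FeatureByte's own scheduling and computation details are not modeled.

```python
import pandas as pd

# Hypothetical item-level data; only the column names mirror the example above.
items = pd.DataFrame({
    "GroceryCustomerGuid": ["c1", "c1", "c1", "c2"],
    "Timestamp": pd.to_datetime(
        ["2022-12-16", "2022-12-01", "2022-10-01", "2022-12-15"]
    ),
    "Discount": [1.0, 2.0, 4.0, 8.0],
})


def windowed_sum(df, customer, point_in_time, days):
    """Sum Discount for one customer over the `days` preceding point_in_time.

    Assumption: the window is [point_in_time - days, point_in_time).
    """
    start = point_in_time - pd.Timedelta(days=days)
    mask = (
        (df["GroceryCustomerGuid"] == customer)
        & (df["Timestamp"] >= start)
        & (df["Timestamp"] < point_in_time)
    )
    # An empty selection sums to 0.0, similar in spirit to fill_value=0.
    return float(df.loc[mask, "Discount"].sum())


point_in_time = pd.Timestamp("2022-12-17 12:12:40")
discounts_7d = windowed_sum(items, "c1", point_in_time, days=7)
discounts_28d = windowed_sum(items, "c1", point_in_time, days=28)
```

In this sketch, each window yields one value per customer and point in time, which is why a single aggregate_over() call with two windows produces two features.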

A FeatureGroup can also be created with its constructor, which takes a list that may contain Feature, FeatureList, and other FeatureGroup objects.

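# Replace the <...> placeholders with names that exist in your catalog
feature_list = catalog.get_feature_list(<feature_list_name>)
feature = catalog.get_feature(<feature_name>)
# Combine a FeatureGroup, a FeatureList, and a single Feature into one group
my_feature_group = FeatureGroup(
    [customer_discounts, feature_list, feature]
)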

Dropping and Adding Features to a FeatureGroup

You can list the features within the FeatureGroup using the feature_names property:

my_feature_group.feature_names

Features can be removed from the FeatureGroup by specifying their names using the drop() method.

my_feature_group.drop("CustomerDiscounts_7d")

You can add a derived feature and assign it a name.

# Ratio of the 7-day discounts to the 28-day discounts
my_feature_group["CustomerDiscounts_Change_7dvs28d"] = (
    customer_discounts["CustomerDiscounts_7d"] / customer_discounts["CustomerDiscounts_28d"]
)
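
The derived feature is an elementwise ratio of two features. The arithmetic can be sketched in plain pandas on hypothetical values; this is not FeatureByte code, and how FeatureByte itself treats a zero denominator is not something this sketch establishes.

```python
import pandas as pd

# Hypothetical feature values for three customers.
values = pd.DataFrame({
    "CustomerDiscounts_7d": [1.0, 0.0, 5.0],
    "CustomerDiscounts_28d": [4.0, 0.0, 5.0],
})

# Elementwise ratio of the short window to the long window.
ratio = values["CustomerDiscounts_7d"] / values["CustomerDiscounts_28d"]

# In pandas, 0.0 / 0.0 yields NaN, so customers without any discounts
# in the long window have no defined ratio here.
has_undefined_ratio = bool(ratio.isna().any())
```

Because the features above are declared with fill_value=0, a customer with no discounts in the 28-day window has a zero denominator, so it is worth checking how such rows appear when previewing the derived feature.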

Saving Features within a FeatureGroup

Any unsaved features within the FeatureGroup can be saved using the save() method:

my_feature_group.save()

Previewing a FeatureGroup

A FeatureGroup can be previewed by passing a small observation set of up to 50 rows to the preview() method. The observation set must include historical points in time and key values for the primary entity of the FeatureGroup.

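import pandas as pd

# One row per (entity key, point in time) combination to preview
observation_set = pd.DataFrame({
    'GROCERYCUSTOMERGUID': ["30e3fbe4-3cbe-4d51-b6ca-1f990ef9773d"],
    'POINT_IN_TIME': [pd.Timestamp("2022-12-17 12:12:40")]
})
# display() is provided by Jupyter/IPython notebooks
display(my_feature_group.preview(observation_set))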
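
Before calling preview(), it can help to check the observation set against the requirements stated above. The helper below is a hypothetical, pandas-only sketch and not part of the FeatureByte SDK; the name check_observation_set and the exact checks are assumptions based on the 50-row limit and the required columns described in this section.

```python
import pandas as pd

MAX_PREVIEW_ROWS = 50  # row limit stated in the text above


def check_observation_set(obs, entity_columns):
    """Return a list of problems; an empty list means no issue was found."""
    problems = []
    if len(obs) > MAX_PREVIEW_ROWS:
        problems.append(f"{len(obs)} rows exceeds the limit of {MAX_PREVIEW_ROWS}")
    required = ["POINT_IN_TIME", *entity_columns]
    missing = [col for col in required if col not in obs.columns]
    if missing:
        problems.append(f"missing columns: {missing}")
    # Rows without a point in time or an entity key cannot be looked up.
    present = [col for col in required if col in obs.columns]
    if present and obs[present].isna().any().any():
        problems.append("null values in required columns")
    return problems


observation_set = pd.DataFrame({
    "GROCERYCUSTOMERGUID": ["30e3fbe4-3cbe-4d51-b6ca-1f990ef9773d"],
    "POINT_IN_TIME": [pd.Timestamp("2022-12-17 12:12:40")],
})
problems = check_observation_set(observation_set, ["GROCERYCUSTOMERGUID"])
```

An empty problems list only means these basic checks passed; preview() may still apply its own validation.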