featurebyte.view.GroupBy

class GroupBy(
obj: Union[EventView, ItemView, ChangeView, SCDView],
keys: Union[str, List[str]],
category: Union[str, NoneType]=None
)

Description

The groupby method of a view returns a GroupBy object that groups the view's data by one or more columns representing entities (specified in the keys parameter). Within each entity or group of entities, the GroupBy object applies one or more aggregation functions to the data.

The grouping keys determine the primary entity of the features declared through the aggregation functions.
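As a rough illustration of what grouping by one key or by several keys means, the following plain-Python sketch sums a column within each group. It does not use featurebyte and is not how the library is implemented; the rows, the column names and the helper group_and_sum are hypothetical and exist only for this example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple, Union

# Hypothetical rows standing in for a view; the column names are illustrative only.
ROWS = [
    {"customer": "c1", "store": "s1", "discount": 1.5},
    {"customer": "c2", "store": "s1", "discount": 0.5},
    {"customer": "c1", "store": "s2", "discount": 2.0},
]


def group_and_sum(
    rows: List[dict], keys: Union[str, List[str]], value: str
) -> Dict[Tuple, float]:
    """Sum `value` within each distinct combination of `keys`."""
    # Accept a single column name or a list of names, like the keys parameter.
    key_list = [keys] if isinstance(keys, str) else list(keys)
    totals: Dict[Tuple, float] = defaultdict(float)
    for row in rows:
        group = tuple(row[k] for k in key_list)
        totals[group] += row[value]
    return dict(totals)


# One key: one total per customer.
by_customer = group_and_sum(ROWS, "customer", "discount")
# Two keys: one total per (customer, store) pair.
by_customer_and_store = group_and_sum(ROWS, ["customer", "store"], "discount")
```

With a single key, each group corresponds to one entity; with a list of keys, each group corresponds to one combination of entities.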

The groupby method's category parameter lets you specify a categorical column, which is used to generate Cross Aggregate Features. These features aggregate data separately for each category of that column, which makes it possible to capture patterns in an entity across the categories. For instance, you can calculate the amount spent by a customer on each product category during a specific time period.
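To make the idea of aggregating across categories concrete, the sketch below computes, for each customer, a mapping from product group to total spend. It is plain Python rather than featurebyte code, and it only illustrates the shape of the result; the rows, the column names and the helper cross_aggregate_sum are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical rows standing in for a joined items view; names are illustrative only.
ITEM_ROWS = [
    {"customer": "c1", "product_group": "Fruit", "total_cost": 3.0},
    {"customer": "c1", "product_group": "Dairy", "total_cost": 2.0},
    {"customer": "c1", "product_group": "Fruit", "total_cost": 1.0},
    {"customer": "c2", "product_group": "Dairy", "total_cost": 4.0},
]


def cross_aggregate_sum(
    rows: List[dict], key: str, category: str, value: str
) -> Dict[str, Dict[str, float]]:
    """For each `key`, sum `value` separately per `category`."""
    result: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for row in rows:
        result[row[key]][row[category]] += row[value]
    # Convert nested defaultdicts to plain dicts for a clean result.
    return {k: dict(per_category) for k, per_category in result.items()}


spend_by_customer = cross_aggregate_sum(
    ITEM_ROWS, key="customer", category="product_group", value="total_cost"
)
```

Each entity ends up with one value per category instead of a single scalar, which is what distinguishes a cross aggregate from a simple aggregate.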

Parameters

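  • obj: Union[EventView, ItemView, ChangeView, SCDView]
    The view whose data is grouped.

  • keys: Union[str, List[str]]
    The column, or list of columns, representing the entities to group by.

  • category: Union[str, NoneType]
    Default: None
    Optional categorical column used to generate Cross Aggregate Features.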

Examples

Groupby for Aggregate features

>>> import featurebyte as fb
>>> # Assumes `catalog` is an already activated featurebyte catalog
>>> items_view = catalog.get_view("INVOICEITEMS")
>>> # Group items by the column GroceryCustomerGuid that references the customer entity
>>> items_by_customer = items_view.groupby("GroceryCustomerGuid")
>>> # Declare features that measure the discounts received by each customer over 7 and 28 days
>>> customer_discounts = items_by_customer.aggregate_over(
...   "Discount",
...   method=fb.AggFunc.SUM,
...   feature_names=["CustomerDiscounts_7d", "CustomerDiscounts_28d"],
...   fill_value=0,
...   windows=['7d', '28d']
... )
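The windows argument above produces one feature per window. As a loose illustration of a windowed sum, the plain-Python sketch below adds up the values whose timestamps fall within a given number of days before a reference time. It is not featurebyte's implementation: the helper windowed_sum, the sample events and the half-open boundary [reference - window, reference) are assumptions made for this example only.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

# Hypothetical (timestamp, discount) events for a single customer.
EVENTS: List[Tuple[datetime, float]] = [
    (datetime(2024, 1, 30), 1.0),
    (datetime(2024, 1, 25), 2.0),
    (datetime(2024, 1, 10), 4.0),
    (datetime(2023, 12, 1), 8.0),
]


def windowed_sum(
    events: List[Tuple[datetime, float]], reference: datetime, window: str
) -> float:
    """Sum values with timestamps in [reference - window, reference).

    Only day windows such as '7d' are handled in this sketch.
    """
    if not window.endswith("d"):
        raise ValueError("this sketch only supports day windows like '7d'")
    start = reference - timedelta(days=int(window[:-1]))
    return sum(value for ts, value in events if start <= ts < reference)


reference_time = datetime(2024, 1, 31)
discounts_7d = windowed_sum(EVENTS, reference_time, "7d")
discounts_28d = windowed_sum(EVENTS, reference_time, "28d")
```

In this sketch the 7-day window covers only the two most recent events, while the 28-day window also includes the event from 10 January.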

Groupby for Cross Aggregate features

>>> # Join product view to items view
>>> product_view = catalog.get_view("GROCERYPRODUCT")
>>> items_view = items_view.join(product_view)
>>> # Group items by the column GroceryCustomerGuid that references the customer entity
>>> # And use ProductGroup as the column to perform operations across
>>> items_by_customer_across_product_group = items_view.groupby(
...   by_keys="GroceryCustomerGuid", category="ProductGroup"
... )
>>> # Cross Aggregate feature of each customer's purchases across product groups over the past 28 days
>>> customer_inventory_28d = items_by_customer_across_product_group.aggregate_over(
...   "TotalCost",
...   method=fb.AggFunc.SUM,
...   feature_names=["CustomerInventory_28d"],
...   windows=['28d']
... )