6. Create Target
We want to predict the spending of active customers over the next 2 weeks.
Let's create a target that measures the sum of invoice Amount for each customer over the next 14-day period.
In [1]:
import featurebyte as fb
# Set your profile to the tutorial environment
fb.use_profile("tutorial")
catalog_name = "Grocery Dataset Tutorial"
catalog = fb.Catalog.activate(catalog_name)
16:40:50 | INFO | Using configuration file at: /Users/viktor/.featurebyte/config.yaml
16:40:50 | INFO | Active profile: tutorial (https://tutorials.featurebyte.com/api/v1)
16:40:50 | INFO | SDK version: 0.6.0.dev121
16:40:50 | INFO | No catalog activated.
16:40:50 | INFO | 10 feature lists, 59 features deployed
16:40:50 | INFO | Using profile: tutorial
16:40:50 | INFO | Using configuration file at: /Users/viktor/.featurebyte/config.yaml
16:40:50 | INFO | Active profile: tutorial (https://tutorials.featurebyte.com/api/v1)
16:40:51 | INFO | SDK version: 0.6.0.dev121
16:40:51 | INFO | No catalog activated.
16:40:51 | INFO | 10 feature lists, 59 features deployed
16:40:51 | INFO | Catalog activated: Grocery Dataset Tutorial
As we already know, the Amount column comes from the GROCERYINVOICE table, so we need to create a view from that table:
In [2]:
groceryinvoice_view = catalog.get_view("GROCERYINVOICE")
The target is the sum of the Amount column over the next 14 days:
In [3]:
target = groceryinvoice_view\
    .groupby(['GroceryCustomerGuid'])\
    .forward_aggregate(
        "Amount", method="sum",
        target_name="CUSTOMER_Sum_of_invoice_Amount_next_14d",
        window='14d',
        fill_value=0
    )
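To make the forward aggregation concrete, here is a minimal plain-Python sketch of what a 14-day forward sum with a fill value of 0 computes for one customer at one observation time. It does not use FeatureByte and is not how the SDK computes the target. The `forward_sum` helper, the `Timestamp` key, the sample rows, and the choice to exclude the observation time while including the end of the window are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def forward_sum(invoices, customer_id, point_in_time, window_days=14, fill_value=0):
    """Sum Amount for one customer over the window that follows point_in_time."""
    window_end = point_in_time + timedelta(days=window_days)
    amounts = [
        row["Amount"]
        for row in invoices
        if row["GroceryCustomerGuid"] == customer_id
        # Assumption: the window excludes the observation time and includes its end.
        and point_in_time < row["Timestamp"] <= window_end
    ]
    # With no invoice in the window, return the fill value instead of a missing value.
    return sum(amounts) if amounts else fill_value


# Hypothetical sample data; the real table has more columns.
invoices = [
    {"GroceryCustomerGuid": "c1", "Timestamp": datetime(2023, 1, 2), "Amount": 10.0},
    {"GroceryCustomerGuid": "c1", "Timestamp": datetime(2023, 1, 10), "Amount": 5.5},
    {"GroceryCustomerGuid": "c1", "Timestamp": datetime(2023, 1, 20), "Amount": 99.0},
    {"GroceryCustomerGuid": "c2", "Timestamp": datetime(2023, 1, 5), "Amount": 7.0},
]

observation_time = datetime(2023, 1, 1)
print(forward_sum(invoices, "c1", observation_time))  # only the first two c1 invoices fall in the window
print(forward_sum(invoices, "c3", observation_time))  # no invoices at all, so the fill value is returned
```

In this sketch the fill value plays the same role as `fill_value=0` in the call above: a customer with no invoices in the window gets a target of 0 rather than a missing value.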
For a target (or a feature) to be recorded in the catalog, we need to save it:
In [4]:
target.save()
We will also update the description of the target:
In [5]:
target.update_description(
"Sum of invoice Amount for the customer over the next 14d period."
)
The target is now created. We can inspect the target's definition file, which provides an explicit outline of all the operations used to declare the target. For example, this definition includes implicit operations, such as the cleaning operations inherited from the table.
In [6]:
target.definition
Out[6]:
# Generated by SDK version: 0.6.0.dev121
from bson import ObjectId
from featurebyte import ColumnCleaningOperation
from featurebyte import DisguisedValueImputation
from featurebyte import EventTable
from featurebyte import ValueBeyondEndpointImputation

# event_table name: "GROCERYINVOICE"
event_table = EventTable.get_by_id(ObjectId("6564b7ebbeba6c193e0fe3bc"))
event_view = event_table.get_view(
    view_mode="manual",
    drop_column_names=["record_available_at"],
    column_cleaning_operations=[
        ColumnCleaningOperation(
            column_name="Amount",
            cleaning_operations=[
                DisguisedValueImputation(
                    imputed_value=None, disguised_values=[-99.0, -98.0]
                ),
                ValueBeyondEndpointImputation(
                    type="less_than", end_point=0.0, imputed_value=0.0
                ),
                ValueBeyondEndpointImputation(
                    type="greater_than", end_point=2000.0, imputed_value=2000.0
                ),
            ],
        )
    ],
)
target = event_view.groupby(
    by_keys=["GroceryCustomerGuid"], category=None
).forward_aggregate(
    value_column="Amount",
    method="sum",
    window="14d",
    target_name="CUSTOMER_Sum_of_invoice_Amount_next_14d",
    skip_fill_na=True,
)
target_1 = target.copy()
target_1[target.isnull()] = 0
target_1.name = "CUSTOMER_Sum_of_invoice_Amount_next_14d"
output = target_1
output.save(_id=ObjectId("6564b8859118708d78509816"))
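The three cleaning operations listed for the Amount column in the definition can be read as a small per-value rule set. The sketch below is one plain-Python interpretation of those rules, applied in the order they appear; it is not FeatureByte code. The `clean_amount` helper is hypothetical, and leaving missing values untouched by the two endpoint rules is an assumption.

```python
def clean_amount(value):
    """Apply the three Amount cleaning rules from the definition, in order."""
    # Assumption: values that are already missing stay missing.
    if value is None:
        return None
    # DisguisedValueImputation: -99 and -98 are placeholders for "missing".
    if value in (-99.0, -98.0):
        return None
    # ValueBeyondEndpointImputation (less_than): negative amounts become 0.
    if value < 0.0:
        return 0.0
    # ValueBeyondEndpointImputation (greater_than): amounts above 2000 are capped.
    if value > 2000.0:
        return 2000.0
    return value


raw_amounts = [-99.0, -5.0, 12.5, 2500.0, None]
cleaned = [clean_amount(v) for v in raw_amounts]
print(cleaned)
```

The last lines of the definition, where null targets are replaced with 0, correspond to the `fill_value=0` argument passed when the target was declared.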