10. Create Features from SCD Table
Three types of features can be declared from a Slowly Changing Dimension Table in FeatureByte:
- Lookup features: Current attributes or past attributes using an offset.
- Aggregate "As At" features: Aggregates as at a particular point in time.
- Change features: Features derived from a change view that tracks the changes in a column of the table.
Here we declare lookup features and change features from the CLIENT_PROFILE and LOAN_STATUS tables.
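The rest of this notebook only builds lookup and change features, so as a rough illustration of the second type, here is a plain-Python sketch of an "As At" aggregate. It is not FeatureByte code: the rows, the column layout and the helper `count_as_at` are hypothetical and only show the idea of aggregating over the records that are active at a given point in time.

```python
from collections import Counter
from datetime import datetime

# Hypothetical SCD rows: (client_id, valid_from, organization_type).
scd_rows = [
    (1, datetime(2020, 1, 1), "School"),
    (1, datetime(2022, 6, 1), "Bank"),
    (2, datetime(2021, 3, 1), "Bank"),
    (3, datetime(2023, 1, 1), "School"),
]


def count_as_at(rows, point_in_time):
    """Count clients per organization type, using each client's row active at point_in_time."""
    active = {}
    for client_id, valid_from, org_type in sorted(rows, key=lambda r: r[1]):
        if valid_from <= point_in_time:
            # Rows are visited in chronological order, so later rows overwrite earlier ones.
            active[client_id] = org_type
    return Counter(active.values())


counts = count_as_at(scd_rows, datetime(2022, 1, 1))
print(dict(counts))
```

Client 3 has no record yet on 2022-01-01, so it does not contribute to any count at that point in time.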
Activate catalog
In [1]:
import featurebyte as fb
# Set your profile to the tutorial environment
fb.use_profile("tutorial")
catalog_name = "Loan Applications Dataset SDK Tutorial"
catalog = fb.Catalog.activate(catalog_name)
14:10:11 | INFO | SDK version: 3.0.1.dev45
14:10:11 | INFO | No catalog activated.
14:10:11 | INFO | Using profile: tutorial
14:10:11 | INFO | Using configuration file at: /Users/gxav/.featurebyte/config.yaml
14:10:11 | INFO | Active profile: tutorial (https://tutorials.featurebyte.com/api/v1)
14:10:11 | INFO | SDK version: 3.0.1.dev45
14:10:11 | INFO | No catalog activated.
14:10:11 | INFO | Catalog activated: Loan Applications Dataset SDK Tutorial
Get view from table
In [2]:
# Get view from CLIENT_PROFILE scd table.
client_profile_view = catalog.get_view("CLIENT_PROFILE")
In [3]:
# Get view from LOAN_STATUS scd table.
loan_status_view = catalog.get_view("LOAN_STATUS")
In [4]:
# Get view from PREVIOUS_APPLICATION event table.
previous_application_view = catalog.get_view("PREVIOUS_APPLICATION")
Create Lookup feature from CLIENT_PROFILE
In [5]:
# Create lookup feature from BIRTHDATE column for Client entity.
client_birthdate = client_profile_view["BIRTHDATE"].as_feature("CLIENT_BIRTHDATE")
In [6]:
# Create lookup feature from GENDER column for Client entity.
client_gender = client_profile_view["GENDER"].as_feature("CLIENT_GENDER")
In [7]:
# Create lookup feature from EDUCATION_TYPE column for Client entity.
client_education_type = client_profile_view["EDUCATION_TYPE"].as_feature("CLIENT_EDUCATION_TYPE")
In [8]:
# Create lookup feature from FAMILY_STATUS column for Client entity.
client_family_status = client_profile_view["FAMILY_STATUS"].as_feature("CLIENT_FAMILY_STATUS")
In [9]:
# Create lookup feature from ORGANIZATION_TYPE column for Client entity.
client_organization_type = client_profile_view["ORGANIZATION_TYPE"].as_feature("CLIENT_ORGANIZATION_TYPE")
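To make the idea of a lookup on a Slowly Changing Dimension table concrete, the following plain-Python sketch picks the value that was active at a point in time, optionally shifted back by an offset. It is only an illustration of the concept, not how FeatureByte computes the feature: the history data and the helper `lookup_as_of` are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime, timedelta

# Hypothetical FAMILY_STATUS history for one client: (valid_from, value), sorted by valid_from.
history = [
    (datetime(2019, 5, 1), "Single"),
    (datetime(2021, 9, 15), "Married"),
]


def lookup_as_of(history, point_in_time, offset=timedelta(0)):
    """Return the value active at (point_in_time - offset), or None if no record exists yet."""
    target = point_in_time - offset
    starts = [valid_from for valid_from, _ in history]
    # Index of the first record that starts strictly after the target time.
    idx = bisect_right(starts, target)
    return history[idx - 1][1] if idx else None


current = lookup_as_of(history, datetime(2022, 1, 1))
one_year_ago = lookup_as_of(history, datetime(2022, 1, 1), offset=timedelta(days=365))
print(current, one_year_ago)
```

With no offset the lookup returns the current attribute; with an offset it returns the attribute as it was that long before the point in time.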
Derive Time since Lookup feature
In [10]:
# Derive Age from the point-in-time and the date of birth.
client_age = ((fb.RequestColumn.point_in_time() - client_birthdate).dt.day / 365.25).floor()
# Name feature
client_age.name = "CLIENT_Age"
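The same arithmetic can be checked outside the SDK with the standard library. This is a minimal sketch under the assumption that the age is the whole number of 365.25-day years between the birthdate and the point in time; the helper `age_in_years` is hypothetical and uses whole days, which may differ slightly from how the SDK measures the time difference.

```python
import math
from datetime import datetime


def age_in_years(point_in_time, birthdate):
    """Approximate age: elapsed whole days divided by 365.25, rounded down."""
    days = (point_in_time - birthdate).days
    return math.floor(days / 365.25)


age = age_in_years(datetime(2024, 3, 1), datetime(1990, 6, 15))
print(age)
```

Because 365.25 is an average year length, the result can be off by one very close to a birthday: exactly 365 days after birth, the sketch still returns 0.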
Create ratio column
In [11]:
previous_application_view["AMT_APPLICATION To AMT_CREDIT"] = (
    previous_application_view["AMT_APPLICATION"] / previous_application_view["AMT_CREDIT"]
)
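As a rough illustration of what this derived column holds, here is a plain-Python sketch of a row-wise ratio. The sample rows and the helper `safe_ratio` are hypothetical, and returning `None` for a zero or missing denominator is an assumption of this sketch; how the SDK treats those cases is not stated in this notebook.

```python
# Hypothetical rows from a previous-application table.
rows = [
    {"AMT_APPLICATION": 90000.0, "AMT_CREDIT": 100000.0},
    {"AMT_APPLICATION": 50000.0, "AMT_CREDIT": 0.0},
    {"AMT_APPLICATION": None, "AMT_CREDIT": 20000.0},
]


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the ratio is undefined."""
    if numerator is None or denominator is None or denominator == 0:
        return None
    return numerator / denominator


for row in rows:
    row["AMT_APPLICATION To AMT_CREDIT"] = safe_ratio(row["AMT_APPLICATION"], row["AMT_CREDIT"])

ratios = [row["AMT_APPLICATION To AMT_CREDIT"] for row in rows]
print(ratios)
```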
Join views
In [12]:
# Join PREVIOUS_APPLICATION view to LOAN_STATUS view.
loan_status_view = loan_status_view.join(previous_application_view, rprefix="PriorApplication_")
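The `rprefix` argument explains the long column names used later, such as `Loan_PriorApplication_ClientID`: each join prepends its prefix to the columns coming from the right-hand view. The sketch below mimics that naming with plain dictionaries. It is not the SDK's join logic; the key name `APPLICATION_ID`, the sample rows and the helper `left_join_with_prefix` are hypothetical.

```python
def left_join_with_prefix(left_rows, right_rows, key, rprefix):
    """Left-join two lists of dicts on `key`, prefixing the right-hand columns."""
    right_by_key = {row[key]: row for row in right_rows}
    joined = []
    for left in left_rows:
        out = dict(left)
        # Rows without a match simply get no prefixed columns in this sketch.
        right = right_by_key.get(left[key], {})
        for column, value in right.items():
            if column != key:
                out[rprefix + column] = value
        joined.append(out)
    return joined


loan_status_rows = [
    {"APPLICATION_ID": 10, "STATUS": "Active"},
    {"APPLICATION_ID": 11, "STATUS": "Closed"},
]
previous_application_rows = [
    {"APPLICATION_ID": 10, "ClientID": 7, "AMT_CREDIT": 100000.0},
]

joined = left_join_with_prefix(
    loan_status_rows, previous_application_rows, "APPLICATION_ID", "PriorApplication_"
)
print(joined[0])
```

Applying a second join with `rprefix="Loan_"` on top of this result would yield names like `Loan_PriorApplication_ClientID`.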
Create features on changes in LOAN_STATUS termination_timestamp
In [13]:
# Create change view from LOAN_STATUS table to track changes in termination_timestamp.
loan_status_termination_view = catalog.get_table("LOAN_STATUS").get_change_view(
    track_changes_column="termination_timestamp"
)
# Keep only the changes where the new termination_timestamp is set.
cond = loan_status_termination_view["new_termination_timestamp"].isnull()
loan_status_termination_view = loan_status_termination_view[~cond]
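Conceptually, a change view turns consecutive records of an SCD table into rows holding the past and new values of the tracked column. The plain-Python sketch below illustrates this and the null filter applied above. It is a simplification under stated assumptions: the history, the `valid_from` column name and the helper `build_change_rows` are hypothetical, and the exact columns and rows the SDK produces may differ.

```python
from datetime import datetime

# Hypothetical SCD history for one loan.
history = [
    {"valid_from": datetime(2022, 1, 10), "termination_timestamp": None},
    {"valid_from": datetime(2022, 8, 1), "termination_timestamp": datetime(2022, 8, 1)},
    {"valid_from": datetime(2023, 2, 1), "termination_timestamp": None},
]


def build_change_rows(history, column):
    """Emit one row per consecutive pair of records whose `column` value differs."""
    ordered = sorted(history, key=lambda r: r["valid_from"])
    changes = []
    for past, new in zip(ordered, ordered[1:]):
        if past[column] != new[column]:
            changes.append(
                {
                    "new_valid_from": new["valid_from"],
                    "past_" + column: past[column],
                    "new_" + column: new[column],
                }
            )
    return changes


changes = build_change_rows(history, "termination_timestamp")
# Same idea as the isnull() filter: keep changes where the new value is set.
terminations = [c for c in changes if c["new_termination_timestamp"] is not None]
print(len(changes), len(terminations))
```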
In [14]:
# Join LOAN_STATUS view to loan_status_termination_view view.
loan_status_termination_view = loan_status_termination_view.join(loan_status_view, rprefix="Loan_")
Do window aggregation from loan_status_termination_view
See SDK reference for features
See SDK reference to groupby a view
See SDK reference to do aggregation over time
In [15]:
# Group loan_status_termination_view view by Client entity (Loan_PriorApplication_ClientID).
loan_status_termination_view_by_client = loan_status_termination_view.groupby(["Loan_PriorApplication_ClientID"])
In [16]:
# Get Max of Loan_PriorApplication_AMT_APPLICATION To AMT_CREDIT for the Client over time.
client_max_of_loan_terminations_loan_priorapplication_amt_application_to_amt_credits_104w = (
    loan_status_termination_view_by_client.aggregate_over(
        "Loan_PriorApplication_AMT_APPLICATION To AMT_CREDIT",
        method="max",
        feature_names=["CLIENT_Max_of_Loan_terminations_Loan_PriorApplication_AMT_APPLICATION_To_AMT_CREDITs_104w"],
        windows=["104w"],
    )["CLIENT_Max_of_Loan_terminations_Loan_PriorApplication_AMT_APPLICATION_To_AMT_CREDITs_104w"]
)
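For intuition, the window aggregation above can be pictured as taking the maximum over the termination events that fall within the 104 weeks before the point in time. The sketch below shows that logic in plain Python. It is a simplified illustration, not the SDK's computation: the events and the helper `max_over_window` are hypothetical, and it ignores any scheduling or data-latency settings the platform may apply.

```python
from datetime import datetime, timedelta

# Hypothetical termination events for one client: (timestamp, ratio value).
events = [
    (datetime(2021, 12, 1), 1.5),   # older than 104 weeks, excluded
    (datetime(2022, 6, 1), 0.9),
    (datetime(2023, 3, 1), 1.1),
    (datetime(2024, 2, 1), 2.0),    # after the point in time, excluded
]


def max_over_window(events, point_in_time, window):
    """Max of values whose timestamp lies in [point_in_time - window, point_in_time)."""
    start = point_in_time - window
    values = [
        value
        for timestamp, value in events
        if start <= timestamp < point_in_time and value is not None
    ]
    return max(values) if values else None


result = max_over_window(events, datetime(2024, 1, 1), timedelta(weeks=104))
print(result)
```

Excluding events at or after the point in time is what keeps the feature free of future information.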
In [17]:
fb.FeatureGroup(
    [
        client_organization_type,
        client_education_type,
        client_gender,
        client_family_status,
        client_age,
        client_max_of_loan_terminations_loan_priorapplication_amt_application_to_amt_credits_104w,
    ]
).save()
Done! |████████████████████████████████████████| 100% in 6.1s (0.17%/s)
Done! |████████████████████████████████████████| 100% in 6.2s (0.16%/s)
Loading Feature(s) |████████████████████████████████████████| 6/6 [100%] in 0.2s
Update feature type
In [18]:
# Update feature type
client_organization_type.update_feature_type("categorical")
In [19]:
# Update feature type
client_education_type.update_feature_type("categorical")
In [20]:
# Update feature type
client_gender.update_feature_type("categorical")
In [21]:
# Update feature type
client_family_status.update_feature_type("categorical")
In [22]:
# Update feature type
client_age.update_feature_type("numeric")
In [23]:
# Update feature type
client_max_of_loan_terminations_loan_priorapplication_amt_application_to_amt_credits_104w.update_feature_type("numeric")
Add description
In [24]:
# Add description
client_organization_type.update_description("ORGANIZATION_TYPE of the Client")
In [25]:
# Add description
client_education_type.update_description("EDUCATION_TYPE of the Client")
In [26]:
# Add description
client_gender.update_description("GENDER of the Client")
In [27]:
# Add description
client_family_status.update_description("FAMILY_STATUS of the Client")
In [28]:
# Add description
client_age.update_description("Age of the Client.")
In [29]:
# Add description
client_max_of_loan_terminations_loan_priorapplication_amt_application_to_amt_credits_104w.update_description(
    "Max of Loan terminations Loan_PriorApplication_AMT_APPLICATION To "
    "AMT_CREDITs for the Client over a 104w period."
)