3. Register Entities
Registering entities represented in the Loan Applications dataset
In FeatureByte, an entity models a real-world object or concept. Entities are usually represented by identifier columns in database tables.
Taking our loan applications scenario as an example, we can view "Client", "New Application", "Prior Application", and "Loan" as entities.
To help FeatureByte identify these entities in the data, this tutorial creates each entity and then tags the table columns that represent it.
import featurebyte as fb
# Set your profile to the tutorial environment
fb.use_profile("tutorial")
catalog_name = "Loan Applications Dataset SDK Tutorial"
catalog = fb.Catalog.activate(catalog_name)
14:06:37 | INFO | SDK version: 3.0.1.dev45
14:06:37 | INFO | No catalog activated.
14:06:37 | INFO | Using profile: tutorial
14:06:37 | INFO | Using configuration file at: /Users/gxav/.featurebyte/config.yaml
14:06:37 | INFO | Active profile: tutorial (https://tutorials.featurebyte.com/api/v1)
14:06:37 | INFO | SDK version: 3.0.1.dev45
14:06:37 | INFO | No catalog activated.
14:06:37 | INFO | Catalog activated: Loan Applications Dataset SDK Tutorial
Identify Entities in Your Data and Decide on Their Serving Names
When creating an entity, you need to define its serving name. The serving name is the name of the unique identifier used to identify the entity in preview and serving requests.
catalog.create_entity(name="New Application", serving_names=["SK_ID_CURR"])
catalog.create_entity(name="Client", serving_names=["ClientID"])
catalog.create_entity(name="BureauReportedCredit", serving_names=["SK_ID_BUREAU"])
catalog.create_entity(name="PriorApplication", serving_names=["APPLICATION_ID"])
catalog.create_entity(name="Installment", serving_names=["INSTALMENT_ID"])
catalog.create_entity(name="Loan", serving_names=["LOAN_ID"])
name | Loan |
created_at | 2025-06-02 05:47:03 |
updated_at | None |
description | None |
serving_names | ['LOAN_ID'] |
catalog_name | Loan Applications Dataset SDK Tutorial |
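To make the role of serving names more concrete, the sketch below checks whether a request row carries the identifiers needed for a given set of entities. The `SERVING_NAMES` mapping is copied from the `create_entity` calls above; the `missing_serving_names` helper and the identifier values are hypothetical and are not part of the FeatureByte SDK.

```python
# Serving names registered in this tutorial (entity name -> serving name).
SERVING_NAMES = {
    "New Application": "SK_ID_CURR",
    "Client": "ClientID",
    "BureauReportedCredit": "SK_ID_BUREAU",
    "PriorApplication": "APPLICATION_ID",
    "Installment": "INSTALMENT_ID",
    "Loan": "LOAN_ID",
}


def missing_serving_names(request_row, entity_names):
    """Return the serving names required by `entity_names` that the row lacks.

    Hypothetical helper for illustration only; not a FeatureByte function.
    """
    required = [SERVING_NAMES[name] for name in entity_names]
    return [column for column in required if column not in request_row]


# A request row identifies each entity by its serving name, not its entity name.
# The identifier values below are made up.
request_row = {"SK_ID_CURR": 100001, "ClientID": 42}

print(missing_serving_names(request_row, ["New Application", "Client"]))
print(missing_serving_names(request_row, ["Loan"]))
```

The first call reports nothing missing, while the second reports `LOAN_ID`, because the row does not identify a Loan.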
Tag Columns Representing Entities
Now that we've established the entities, it's time to guide FeatureByte in mapping these entities to the actual data in our tables.
application = catalog.get_table("NEW_APPLICATION")
application["ClientID"].as_entity("Client")
application["SK_ID_CURR"].as_entity("New Application")
client_profile = catalog.get_table("CLIENT_PROFILE")
client_profile["ClientID"].as_entity("Client")
bureau = catalog.get_table("BUREAU")
bureau["ClientID"].as_entity("Client")
bureau["SK_ID_BUREAU"].as_entity("BureauReportedCredit")
previous_application = catalog.get_table("PREVIOUS_APPLICATION")
previous_application["ClientID"].as_entity("Client")
previous_application["APPLICATION_ID"].as_entity("PriorApplication")
loan_status = catalog.get_table("LOAN_STATUS")
loan_status["LOAN_ID"].as_entity("Loan")
loan_status["APPLICATION_ID"].as_entity("PriorApplication")
installments_payments = catalog.get_table("INSTALLMENTS_PAYMENTS")
installments_payments["INSTALMENT_ID"].as_entity("Installment")
installments_payments["APPLICATION_ID"].as_entity("PriorApplication")
credit_card_balance = catalog.get_table("CREDIT_CARD_MONTHLY_BALANCE")
credit_card_balance["ClientID"].as_entity("Client")
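As an alternative to the individual calls above, the same tagging can be driven from a single mapping, which makes the table-to-entity assignments easier to review. This is a minimal sketch: `ENTITY_TAGS` and `tag_entities` are hypothetical names introduced here, and the function uses only the `get_table` and `as_entity` calls already shown in this tutorial.

```python
# Table -> {column: entity}, copied from the tagging calls above.
ENTITY_TAGS = {
    "NEW_APPLICATION": {"ClientID": "Client", "SK_ID_CURR": "New Application"},
    "CLIENT_PROFILE": {"ClientID": "Client"},
    "BUREAU": {"ClientID": "Client", "SK_ID_BUREAU": "BureauReportedCredit"},
    "PREVIOUS_APPLICATION": {
        "ClientID": "Client",
        "APPLICATION_ID": "PriorApplication",
    },
    "LOAN_STATUS": {"LOAN_ID": "Loan", "APPLICATION_ID": "PriorApplication"},
    "INSTALLMENTS_PAYMENTS": {
        "INSTALMENT_ID": "Installment",
        "APPLICATION_ID": "PriorApplication",
    },
    "CREDIT_CARD_MONTHLY_BALANCE": {"ClientID": "Client"},
}


def tag_entities(catalog, tags):
    """Tag every (table, column) pair in `tags`; return how many columns were tagged.

    `catalog` is any object exposing `get_table(name)` whose result supports
    `table[column].as_entity(entity_name)`, as the FeatureByte catalog does above.
    """
    count = 0
    for table_name, columns in tags.items():
        table = catalog.get_table(table_name)
        for column_name, entity_name in columns.items():
            table[column_name].as_entity(entity_name)
            count += 1
    return count


# With the catalog activated earlier in this tutorial:
# tag_entities(catalog, ENTITY_TAGS)
```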
Review Entity Relationships
Now, if we list the tables as we did in the previous tutorial, we'll notice that entities have been assigned to each table.
display(catalog.list_tables())
 | id | name | type | status | entities | created_at |
---|---|---|---|---|---|---|
0 | 683d3a70f4d55cd61d526cee | CREDIT_CARD_MONTHLY_BALANCE | time_series_table | PUBLIC_DRAFT | [Client] | 2025-06-02T05:45:20.813000 |
1 | 683d3a6ef4d55cd61d526ced | LOAN_STATUS | scd_table | PUBLIC_DRAFT | [Loan, PriorApplication] | 2025-06-02T05:45:18.744000 |
2 | 683d3a6cf4d55cd61d526cec | PREVIOUS_APPLICATION | event_table | PUBLIC_DRAFT | [PriorApplication, Client] | 2025-06-02T05:45:16.825000 |
3 | 683d3a6af4d55cd61d526ceb | INSTALLMENTS_PAYMENTS | event_table | PUBLIC_DRAFT | [Installment, PriorApplication] | 2025-06-02T05:45:14.598000 |
4 | 683d3a69f4d55cd61d526cea | BUREAU | event_table | PUBLIC_DRAFT | [Client, BureauReportedCredit] | 2025-06-02T05:45:13.206000 |
5 | 683d3a66f4d55cd61d526ce9 | CLIENT_PROFILE | scd_table | PUBLIC_DRAFT | [Client] | 2025-06-02T05:45:11.014000 |
6 | 683d3a64f4d55cd61d526ce8 | NEW_APPLICATION | dimension_table | PUBLIC_DRAFT | [New Application, Client] | 2025-06-02T05:45:08.853000 |
We can also list entities separately:
display(catalog.list_entities())
 | id | name | serving_names | created_at |
---|---|---|---|---|
0 | 683d3ad761061d4da44450d5 | Loan | [LOAN_ID] | 2025-06-02T05:47:03.194000 |
1 | 683d3ad661061d4da44450d4 | Installment | [INSTALMENT_ID] | 2025-06-02T05:47:03.049000 |
2 | 683d3ad661061d4da44450d3 | PriorApplication | [APPLICATION_ID] | 2025-06-02T05:47:02.918000 |
3 | 683d3ad661061d4da44450d2 | BureauReportedCredit | [SK_ID_BUREAU] | 2025-06-02T05:47:02.783000 |
4 | 683d3ad661061d4da44450d1 | Client | [ClientID] | 2025-06-02T05:47:02.638000 |
5 | 683d3ad661061d4da44450d0 | New Application | [SK_ID_CURR] | 2025-06-02T05:47:02.488000 |
FeatureByte also established relationships between entities when we tagged the columns. Let's examine them:
display(catalog.list_relationships())
 | id | relationship_type | entity | related_entity | relation_table | relation_table_type | enabled | created_at | updated_at |
---|---|---|---|---|---|---|---|---|---|
0 | 683d3ada954f0aa89942ada0 | child_parent | Installment | PriorApplication | INSTALLMENTS_PAYMENTS | event_table | True | 2025-06-02T05:47:06.230000 | None |
1 | 683d3ad9954f0aa89942ad98 | child_parent | Loan | PriorApplication | LOAN_STATUS | scd_table | True | 2025-06-02T05:47:05.761000 | None |
2 | 683d3ad9954f0aa89942ad8f | child_parent | PriorApplication | Client | PREVIOUS_APPLICATION | event_table | True | 2025-06-02T05:47:05.245000 | None |
3 | 683d3ad8954f0aa89942ad89 | child_parent | BureauReportedCredit | Client | BUREAU | event_table | True | 2025-06-02T05:47:04.739000 | None |
4 | 683d3ad7cd54c1e3f75b0dfc | child_parent | New Application | Client | NEW_APPLICATION | dimension_table | True | 2025-06-02T05:47:03.867000 | None |
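One way to read the relationships listed above is with a simplified rule: the entity tagged on a table's key column is treated as the child of every other entity tagged in the same table. The sketch below applies that rule to the tags from this tutorial. It is an illustrative model only, not FeatureByte's implementation, and the choice of key column for each table is an assumption that this tutorial does not state.

```python
# Table -> (assumed key column, {column: entity}).
# Entity tags are copied from this tutorial; key columns are assumptions.
TABLES = {
    "NEW_APPLICATION": (
        "SK_ID_CURR",
        {"ClientID": "Client", "SK_ID_CURR": "New Application"},
    ),
    "CLIENT_PROFILE": ("ClientID", {"ClientID": "Client"}),
    "BUREAU": (
        "SK_ID_BUREAU",
        {"ClientID": "Client", "SK_ID_BUREAU": "BureauReportedCredit"},
    ),
    "PREVIOUS_APPLICATION": (
        "APPLICATION_ID",
        {"ClientID": "Client", "APPLICATION_ID": "PriorApplication"},
    ),
    "LOAN_STATUS": (
        "LOAN_ID",
        {"LOAN_ID": "Loan", "APPLICATION_ID": "PriorApplication"},
    ),
    "INSTALLMENTS_PAYMENTS": (
        "INSTALMENT_ID",
        {"INSTALMENT_ID": "Installment", "APPLICATION_ID": "PriorApplication"},
    ),
    "CREDIT_CARD_MONTHLY_BALANCE": ("ClientID", {"ClientID": "Client"}),
}


def infer_child_parent(tables):
    """Return (child, parent, table) triples under the simplified rule above."""
    relationships = []
    for table_name, (key_column, tags) in tables.items():
        child = tags[key_column]
        for entity in tags.values():
            # Every other entity tagged in the table becomes a parent of the key entity.
            if entity != child:
                relationships.append((child, entity, table_name))
    return relationships


for child, parent, table in infer_child_parent(TABLES):
    print(f"{child} -> {parent} (via {table})")
```

Under these assumptions the rule yields the same five child-parent pairs as the listing above; tables that tag only one entity, such as CLIENT_PROFILE, contribute no relationship.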