3. Register Entities
Registering entities represented in the Credit Default dataset
In FeatureByte, an entity models a real-world object or concept. Entities usually correspond to columns in database tables.
Taking our credit default scenario as an example, we can view "Client", "New Application", "Prior Application", and "Consumer Loan" as entities.
So that FeatureByte can recognize these entities in the data, this tutorial creates each entity and then tags the table columns that represent it.
import featurebyte as fb
# Set your profile to the tutorial environment
fb.use_profile("tutorial")
catalog_name = "Credit Default Dataset SDK Tutorial"
catalog = fb.Catalog.activate(catalog_name)
16:38:04 | WARNING | Service endpoint is inaccessible: http://featurebyte-server:8088/
16:38:04 | INFO | Using profile: tutorial
16:38:04 | INFO | Using configuration file at: /Users/gxav/.featurebyte/config.yaml
16:38:04 | INFO | Active profile: tutorial (https://tutorials.featurebyte.com/api/v1)
16:38:04 | INFO | SDK version: 2.1.0.dev113
16:38:04 | INFO | No catalog activated.
16:38:04 | INFO | Catalog activated: Credit Default Dataset SDK Tutorial
Identify Entities in Your Data and Decide Their Serving Names
When creating an entity, you need to define its serving name. The serving name is the name of the unique identifier used to identify the entity in preview and serving requests. It does not have to match the column name in the source tables.
catalog.create_entity(name="New Application", serving_names=["NEW_APPLICATION_ID"])
catalog.create_entity(name="Client", serving_names=["CLIENT_ID"])
catalog.create_entity(name="Prior Application", serving_names=["PRIOR_APPLICATION_ID"])
catalog.create_entity(name="Consumer Loan", serving_names=["CONSUMER_LOAN_ID"])
name | Consumer Loan |
created_at | 2025-03-01 08:38:06 |
updated_at | None |
description | None |
serving_names | ['CONSUMER_LOAN_ID'] |
catalog_name | Credit Default Dataset SDK Tutorial |
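To make the role of a serving name concrete, here is a minimal sketch in plain Python. It does not call FeatureByte; it only illustrates the idea that a request row must carry a key named after the serving name of each entity it refers to. The helper `missing_serving_names`, the sample ID value, and the `POINT_IN_TIME` key in the sample row are illustrative assumptions, not part of this tutorial.

```python
# Entity name -> serving name, copied from the create_entity calls above.
SERVING_NAMES = {
    "New Application": "NEW_APPLICATION_ID",
    "Client": "CLIENT_ID",
    "Prior Application": "PRIOR_APPLICATION_ID",
    "Consumer Loan": "CONSUMER_LOAN_ID",
}


def missing_serving_names(request_row, entities):
    """Return the serving names needed for `entities` that the row lacks.

    Hypothetical helper for illustration only; not a FeatureByte API.
    """
    return [
        SERVING_NAMES[entity]
        for entity in entities
        if SERVING_NAMES[entity] not in request_row
    ]


# A made-up request row keyed by serving name (values are placeholders).
row = {"NEW_APPLICATION_ID": 100001, "POINT_IN_TIME": "2025-03-01 00:00:00"}

print(missing_serving_names(row, ["New Application"]))
print(missing_serving_names(row, ["New Application", "Client"]))
```

The first call finds nothing missing, while the second reports that `CLIENT_ID` is absent from the row.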
Tag Columns Representing Entities
Now that we've established the entities, it's time to guide FeatureByte in mapping these entities to the actual data in our tables.
# New applications (dimension table)
application = catalog.get_table("NEW_APPLICATION")
application["CLIENT_ID"].as_entity("Client")
application["APPLICATION_ID"].as_entity("New Application")
# Prior applications (event table); APPLICATION_ID here is a different entity
prior_applications = catalog.get_table("PRIOR_APPLICATIONS")
prior_applications["CLIENT_ID"].as_entity("Client")
prior_applications["APPLICATION_ID"].as_entity("Prior Application")
# Consumer loan status (slowly changing dimension table)
consumer_loans = catalog.get_table("CONSUMER_LOAN_STATUS")
consumer_loans["LOAN_ID"].as_entity("Consumer Loan")
consumer_loans["CLIENT_ID"].as_entity("Client")
# Consumer installments (time series table)
consumer_installments = catalog.get_table("CONSUMER_INSTALLMENTS")
consumer_installments["LOAN_ID"].as_entity("Consumer Loan")
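The same tagging can also be driven from a single mapping, which keeps the table, column, and entity names in one place. The sketch below is an equivalent alternative, not the tutorial's own method. It relies only on the `get_table` and `as_entity` calls shown above; the names `ENTITY_TAGS` and `tag_entities` are hypothetical.

```python
# Table name -> {column name: entity name}, copied from the calls above.
ENTITY_TAGS = {
    "NEW_APPLICATION": {
        "CLIENT_ID": "Client",
        "APPLICATION_ID": "New Application",
    },
    "PRIOR_APPLICATIONS": {
        "CLIENT_ID": "Client",
        "APPLICATION_ID": "Prior Application",
    },
    "CONSUMER_LOAN_STATUS": {
        "LOAN_ID": "Consumer Loan",
        "CLIENT_ID": "Client",
    },
    "CONSUMER_INSTALLMENTS": {
        "LOAN_ID": "Consumer Loan",
    },
}


def tag_entities(catalog, tags):
    """Tag each listed column with its entity.

    `catalog` is expected to behave like the activated FeatureByte catalog:
    `catalog.get_table(name)[column].as_entity(entity)`.
    Returns the (table, column, entity) triples that were applied.
    """
    applied = []
    for table_name, columns in tags.items():
        table = catalog.get_table(table_name)
        for column_name, entity_name in columns.items():
            table[column_name].as_entity(entity_name)
            applied.append((table_name, column_name, entity_name))
    return applied
```

With the catalog activated earlier in this tutorial, the call would be `tag_entities(catalog, ENTITY_TAGS)`. Note that `APPLICATION_ID` maps to a different entity in each of the two application tables, which is why the mapping is keyed by table first.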
Review Entity Relationships
Now, if we list the tables as we did in the previous tutorial, we'll notice that entities have been assigned to each table.
display(catalog.list_tables())
| | id | name | type | status | entities | created_at |
|---|---|---|---|---|---|---|
| 0 | 67c2c752924afe7a79ec6f27 | CONSUMER_INSTALLMENTS | time_series_table | PUBLIC_DRAFT | [Consumer Loan] | 2025-03-01T08:37:38.763000 |
| 1 | 67c2c750924afe7a79ec6f26 | CONSUMER_LOAN_STATUS | scd_table | PUBLIC_DRAFT | [Consumer Loan, Client] | 2025-03-01T08:37:36.903000 |
| 2 | 67c2c74e924afe7a79ec6f25 | PRIOR_APPLICATIONS | event_table | PUBLIC_DRAFT | [Prior Application, Client] | 2025-03-01T08:37:34.829000 |
| 3 | 67c2c74c924afe7a79ec6f24 | NEW_APPLICATION | dimension_table | PUBLIC_DRAFT | [New Application, Client] | 2025-03-01T08:37:32.806000 |
We can also list entities separately:
display(catalog.list_entities())
| | id | name | serving_names | created_at |
|---|---|---|---|---|
| 0 | 67c2c76dfd2556fdb5ea9b92 | Consumer Loan | [CONSUMER_LOAN_ID] | 2025-03-01T08:38:06.210000 |
| 1 | 67c2c76dfd2556fdb5ea9b91 | Prior Application | [PRIOR_APPLICATION_ID] | 2025-03-01T08:38:05.905000 |
| 2 | 67c2c76dfd2556fdb5ea9b90 | Client | [CLIENT_ID] | 2025-03-01T08:38:05.572000 |
| 3 | 67c2c76cfd2556fdb5ea9b8f | New Application | [NEW_APPLICATION_ID] | 2025-03-01T08:38:05.280000 |
Finally, let's examine the relationships between entities, which FeatureByte established automatically from the tagged columns:
display(catalog.list_relationships())
| | id | relationship_type | entity | related_entity | relation_table | relation_table_type | enabled | created_at | updated_at |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 67c2c7714e08d83e213816d6 | child_parent | Consumer Loan | Client | CONSUMER_LOAN_STATUS | scd_table | True | 2025-03-01T08:38:09.286000 | None |
| 1 | 67c2c7704e08d83e213816cd | child_parent | Prior Application | Client | PRIOR_APPLICATIONS | event_table | True | 2025-03-01T08:38:08.356000 | None |
| 2 | 67c2c76f3df413286793fb16 | child_parent | New Application | Client | NEW_APPLICATION | dimension_table | True | 2025-03-01T08:38:07.444000 | None |
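One way to read this listing: when a table's key column is tagged with one entity and another column in the same table is tagged with a different entity, the key entity becomes the child and the other entity its parent. The plain-Python sketch below reproduces that rule on the tags used in this tutorial. It is a simplified illustration, not FeatureByte's implementation. Which column acts as each table's key is an assumption inferred from the column names and table types, and `infer_child_parent` is a hypothetical helper.

```python
# Assumption: the key entity of each table (primary key, or natural key for
# the SCD table). The remaining tagged entities are listed under "others".
TABLES = {
    "NEW_APPLICATION": {"key": "New Application", "others": ["Client"]},
    "PRIOR_APPLICATIONS": {"key": "Prior Application", "others": ["Client"]},
    "CONSUMER_LOAN_STATUS": {"key": "Consumer Loan", "others": ["Client"]},
    "CONSUMER_INSTALLMENTS": {"key": "Consumer Loan", "others": []},
}


def infer_child_parent(tables):
    """Return (child, parent, relation_table) triples implied by the tags."""
    pairs = []
    for table_name, tagged in tables.items():
        for parent in tagged["others"]:
            if parent != tagged["key"]:
                pairs.append((tagged["key"], parent, table_name))
    return pairs


for child, parent, table in infer_child_parent(TABLES):
    print(f"{child} -> {parent} (via {table})")
```

Under these assumptions the sketch yields the same three child-parent pairs as the listing. `CONSUMER_INSTALLMENTS` contributes none because only one entity is tagged in that table.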