Entity
An Entity object contains metadata on an entity type represented or referenced by tables within your data warehouse.
Registering an entity in the catalog
To register an entity in the catalog, specify:
- The name you want to use for the entity.
- A list of serving names.
A serving name is an accepted name for the unique identifier employed to identify the entity during preview or serving requests. Generally, the serving name for an entity is the name of the primary key (or natural key) of the table representing the entity. However, an entity may have multiple serving names for convenience.
You can register a new entity using the create_entity() method.
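A minimal sketch of the registration call, assuming `catalog` is an active catalog object and that create_entity() takes the two items listed above as `name` and `serving_names` keyword arguments (the keyword names are an assumption here). The helper name `register_customer_entity` is hypothetical; the serving name matches the key column used in the preview example later on this page.

```python
def register_customer_entity(catalog):
    """Register a "Customer" entity with a single serving name.

    Assumption: `catalog` exposes create_entity(name=..., serving_names=...).
    """
    return catalog.create_entity(
        name="Customer",
        serving_names=["GROCERYCUSTOMERGUID"],
    )
```

Additional serving names, if any, would go in the same list.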
Tagging an Entity to Columns
Use the as_entity() method to tag the entity to columns that represent or reference it. This will facilitate joins and automatically link features to the corresponding entity.
# Tag the customer entity to GroceryCustomerGuid in the customer table
customer_table = catalog.get_table("GROCERYCUSTOMER")
customer_table.GroceryCustomerGuid.as_entity("Customer")
# Tag the customer entity to GroceryCustomerGuid in the grocery invoice table
invoice_table = catalog.get_table("GROCERYINVOICE")
invoice_table.GroceryCustomerGuid.as_entity("Customer")
To remove an entity tag, set the entity name to None.
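A minimal sketch of removing a tag, assuming that setting the entity name to None means passing None to as_entity() on the tagged column. The helper name `untag_entity` is hypothetical; `column` stands for a table column such as invoice_table.GroceryCustomerGuid.

```python
def untag_entity(column):
    """Remove the entity tag from a table column.

    Assumption: as_entity(None) clears the tag, per the sentence above.
    """
    column.as_entity(None)

# Usage (needs a catalog connection, so it is shown as a comment):
# untag_entity(invoice_table.GroceryCustomerGuid)
```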
Retrieving Information on the Primary Entity of a Feature Object
The primary entity of a feature defines the level of analysis for that feature. In addition, it determines the entities that can be utilized to serve the feature. A feature can be served by its primary entity or any descendant serving entities.
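The serving rule can be pictured with a small toy model. This is plain Python, not the FeatureByte API: the `PARENT_OF` mapping, the `serving_entities` helper, and the "Invoice" and "InvoiceItem" entities are all hypothetical, introduced only to show how a feature whose primary entity is Customer could also be served by entities further down a hierarchy.

```python
# Toy model only: child entity -> parent entity (hypothetical hierarchy).
PARENT_OF = {
    "Invoice": "Customer",
    "InvoiceItem": "Invoice",
}

def serving_entities(primary_entity, parent_of=PARENT_OF):
    """Return the primary entity plus all of its descendants, sorted by name."""
    result = {primary_entity}
    changed = True
    # Keep adding children of already-collected entities until nothing changes.
    while changed:
        changed = False
        for child, parent in parent_of.items():
            if parent in result and child not in result:
                result.add(child)
                changed = True
    return sorted(result)

print(serving_entities("Customer"))  # ['Customer', 'Invoice', 'InvoiceItem']
print(serving_entities("Invoice"))   # ['Invoice', 'InvoiceItem']
```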
To retrieve information on the primary entity of a Feature object, use the primary_entity property.
Here is an example of a simple Lookup feature derived from a view of the customer table where the "Customer" entity was tagged to the natural key GroceryCustomerGuid.
# Get a view from the customer table
customer_view = catalog.get_view("GROCERYCUSTOMER")
# Flag whether the BrowserUserAgent column mentions Windows
customer_view["OperatingSystemIsWindows"] = \
    customer_view.BrowserUserAgent.str.contains("Windows")
# Create a feature from the OperatingSystemIsWindows column
uses_windows = customer_view.OperatingSystemIsWindows.as_feature("UsesWindows")
# Display information on the primary entity of the new feature.
# This should return information on the Customer entity.
display(uses_windows.primary_entity)
Note that the primary_entity property is also available on the FeatureList object, where it likewise determines the entities that can be used to serve the feature list.
Serving Feature Values
To materialize feature values, you can use either the primary entity of the feature or any descendant entity of its primary entity (the serving entities). For preview and historical feature serving, you need to combine entity key values with point-in-time references. You can supply these through either an observation set or an observation table; in both cases, the name of the column holding the key values must be one of the serving names accepted for the entity.
You can preview a feature using the preview method as shown below:
import pandas as pd
observation_set = pd.DataFrame({
    'GROCERYCUSTOMERGUID': ["30e3fbe4-3cbe-4d51-b6ca-1f990ef9773d"],
    'POINT_IN_TIME': [pd.Timestamp("2022-12-17 12:12:40")]
})
display(uses_windows.preview(observation_set))
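Before calling preview, it can help to confirm that an observation set has the columns described above. The sketch below is plain Python and not part of the FeatureByte API: `check_observation_columns` is a hypothetical helper, the `POINT_IN_TIME` column name is taken from the example above, and exact, case-sensitive matching of column names is an assumption.

```python
def check_observation_columns(columns, serving_names,
                              point_in_time_column="POINT_IN_TIME"):
    """Return a list of problems found in an observation set's column names."""
    columns = list(columns)
    problems = []
    if point_in_time_column not in columns:
        problems.append(f"missing {point_in_time_column} column")
    # At least one column must be named after an accepted serving name.
    if not any(name in columns for name in serving_names):
        problems.append(
            "no key column named after a serving name: " + ", ".join(serving_names)
        )
    return problems

# Column names from the observation set above; with a DataFrame,
# pass observation_set.columns instead.
print(check_observation_columns(
    ["GROCERYCUSTOMERGUID", "POINT_IN_TIME"],
    serving_names=["GROCERYCUSTOMERGUID"],
))  # prints []
```

An empty list means both the point-in-time column and a key column were found.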
Accessing an Entity from the Catalog
You can access existing entities through the catalog using the list_entities() and get_entity() methods, as shown below:
# List entities in the catalog
catalog.list_entities()
# Retrieve an entity
customer_entity = catalog.get_entity("Customer")
To retrieve an Entity object using its Object ID, use the get_entity_by_id() method.
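A minimal sketch, assuming get_entity_by_id() is called on the catalog like the two methods above and takes the Object ID as its only argument. The helper name `fetch_entity` is hypothetical, and no concrete Object ID is shown because the value depends on your catalog.

```python
def fetch_entity(catalog, entity_id):
    """Look up an Entity by its Object ID.

    Assumption: `catalog` exposes get_entity_by_id(<Object ID>).
    """
    return catalog.get_entity_by_id(entity_id)
```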
Listing Tables Representing or Referencing an Entity
List tables where a column has been tagged with the entity using the list_tables() method.
Listing Features Associated with an Entity
List features for which the entity is the primary entity using the list_features() method.
Listing Feature Lists Associated with an Entity
List feature lists containing features with the entity as their primary entity using the list_feature_lists() method.