Catalog

A Catalog object is a centralized metadata repository that organizes tables, entities, features, feature lists, and other objects to facilitate feature reuse and serving within a specific domain. With a catalog, team members can share, search, access, and reuse these assets.

Creating a Catalog

To create a Catalog, use the create() class method:

catalog = fb.Catalog.create(
    catalog_name=<catalog_name>, feature_store_name=<feature_store_name>
)

If needed, you can rename your catalog for clarity using the update_name() method, which updates the catalog object in place:

catalog.update_name("Catalog for Grocery Business")

Accessing a Catalog

To list available catalogs, use the list() class method:

fb.Catalog.list()

To get a specific Catalog object without activating, use the get() class method:

catalog = fb.Catalog.get(<catalog_name>)

To activate a Catalog object, use the activate() class method:

catalog = fb.Catalog.activate(<catalog_name>)

To get the active Catalog object, use the get_active() class method:

catalog = fb.Catalog.get_active()

Note

Only one catalog can be active at a time.

Using a catalog object that is not active may raise an error or lead to unexpected behavior.
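
Because only one catalog is active at a time, it can help to check which catalog is active before activating another. The helper below is a minimal sketch of that check, not part of the SDK: ensure_active is a hypothetical name, the featurebyte module is passed in as fb, and the sketch assumes that catalog objects expose a name attribute and that get_active() may return None when nothing is active.

```python
from typing import Any


def ensure_active(fb: Any, catalog_name: str) -> Any:
    """Return the active catalog, activating `catalog_name` first if needed.

    `fb` is the imported featurebyte module, passed in explicitly so the
    helper has no import-time dependency and is easy to test.
    """
    active = fb.Catalog.get_active()
    # Assumption: catalog objects expose their name as a `name` attribute,
    # and get_active() may return None when no catalog is active.
    if active is not None and active.name == catalog_name:
        return active
    # activate() is the class method described above; it returns the catalog.
    return fb.Catalog.activate(catalog_name)
```

A call such as catalog = ensure_active(fb, <catalog_name>) at the top of a notebook would then keep later calls pointed at the intended catalog.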

Adding Feature Assets to a Catalog

Confirm that you are working with the intended catalog by calling the get_active() class method and then the info() method:

catalog = fb.Catalog.get_active()
catalog.info()

Upon creation, Table, Entity, and Relationship objects are automatically added to the active catalog. Feature and FeatureList objects with a new namespace must be saved using the save() method:

my_feature.save()
my_feature_list.save()
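
When several features and feature lists are created in one session, saving them in a loop avoids forgetting one. The following is a rough sketch only: save_unsaved is a hypothetical helper, and it assumes that each object exposes a boolean saved attribute. If that attribute is missing, the object is treated as unsaved and save() is called on it.

```python
from typing import Any, Iterable, List


def save_unsaved(objects: Iterable[Any]) -> List[Any]:
    """Call save() on every object not yet saved and return those objects."""
    newly_saved = []
    for obj in objects:
        # Assumption: objects expose a boolean `saved` attribute.
        # Objects without it are treated as unsaved.
        if getattr(obj, "saved", False):
            continue
        obj.save()  # adds the object to the active catalog
        newly_saved.append(obj)
    return newly_saved
```

For instance, save_unsaved([my_feature, my_feature_list]) would save both objects the first time it runs.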

Listing Feature Assets in a Catalog

You can list tables, entities, relationships, features, and feature lists from a Catalog object and obtain detailed information about their properties.

| Method | Returns a DataFrame with each row representing | Attributes | Can be filtered by |
|---|---|---|---|
| list_tables() | A table | Table’s ID, name, type, status, associated entities, and creation date | Entity names associated with the table |
| list_entities() | An entity type | Entity type’s ID, name, serving names, and creation date | - |
| list_relationships() | A relationship | Relationship’s ID, type, linked entity names, relation table, status, and creation date | - |
| list_features() | The default version of a feature | Feature’s ID, name, type, related tables, related entities, readiness state, online availability, and creation date | Feature's primary entity or primary table names |
| list_feature_lists() | The default version of a feature list | Feature list’s ID, name, number of features, status, deployment status, feature readiness statistics, tables used by the features, related entities, and creation date | Entity or table names associated with the feature list |
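
As a quick illustration of the listing methods, the sketch below counts the rows each one returns to give a one-glance inventory of a catalog. count_assets is a hypothetical helper, not an SDK function. It calls each method without arguments and relies only on len() of the result, which for a DataFrame is the number of rows.

```python
from typing import Any, Dict

# Listing methods named in the table above.
LISTING_METHODS = (
    "list_tables",
    "list_entities",
    "list_relationships",
    "list_features",
    "list_feature_lists",
)


def count_assets(catalog: Any) -> Dict[str, int]:
    """Map each listing method name to the number of rows it returns."""
    counts = {}
    for method_name in LISTING_METHODS:
        listing = getattr(catalog, method_name)()  # no filters applied
        counts[method_name] = len(listing)
    return counts
```

Calling count_assets(catalog) on the active catalog returns a plain dictionary that is easy to print or log.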

Accessing Feature Assets from a Catalog

You can retrieve tables, entities, relationships, features, and feature lists by name or by ID. Features and feature lists can also be retrieved by version name.

| Object to retrieve | SDK code example |
|---|---|
| DataSource object | data_source = catalog.get_data_source(<feature_store_name>) |
| Table object | table = catalog.get_table(<table_name>) or table = catalog.get_table_by_id(<table_id>) |
| View object | view = catalog.get_view(<table_name>) |
| Entity object | entity = catalog.get_entity(<entity_name>) or entity = catalog.get_entity_by_id(<entity_id>) |
| Relationship object | relationship = catalog.get_relationship_by_id(<relationship_id>) |
| Feature object representing the default version of a feature | feature = catalog.get_feature(<feature_name>) |
| Feature object | feature_version = catalog.get_feature(<feature_name>, version=<version_name>) or feature_version = catalog.get_feature_by_id(<feature_object_id>) |
| FeatureList object representing the default version of a feature list | feature_list = catalog.get_feature_list(<feature_list_name>) |
| FeatureList object | feature_list_version = catalog.get_feature_list(<feature_list_name>, version=<version_name>) or feature_list_version = catalog.get_feature_list_by_id(<feature_list_object_id>) |
| Analysis of a Table's data freshness and availability | analysis = catalog.get_feature_job_setting_analysis_by_id(<analysis_id>) |
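
To show how default and versioned retrieval can be combined, the sketch below loads several features in one pass. load_features is a hypothetical helper. It uses only the get_feature() call forms shown above: the name alone for the default version, or the name plus version=<version_name> for a specific version.

```python
from typing import Any, Dict, Optional, Sequence, Tuple

FeatureSpec = Tuple[str, Optional[str]]  # (feature name, version name or None)


def load_features(catalog: Any, specs: Sequence[FeatureSpec]) -> Dict[FeatureSpec, Any]:
    """Retrieve each requested feature, keyed by its (name, version) pair."""
    features = {}
    for name, version in specs:
        if version is None:
            # No version given: retrieve the default version of the feature.
            features[(name, version)] = catalog.get_feature(name)
        else:
            # Version given: retrieve that specific version.
            features[(name, version)] = catalog.get_feature(name, version=version)
    return features
```

Keying the result by (name, version) keeps two versions of the same feature from overwriting each other.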

Organizing Training and Test Data

Catalog your observation sets, training, and test data for easy access and lineage tracking.

  1. Observation sets are added to the active catalog when ObservationTable objects are used instead of a pandas DataFrame.
  2. Training and test data are added to the catalog as HistoricalFeatureTable objects when the compute_historical_feature_table() method is used to materialize feature values.

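To list or retrieve these tables from the catalog, use the catalog's list and get methods for each table type, such as list_observation_tables() and get_observation_table() for observation sets, and list_historical_feature_tables() and get_historical_feature_table() for training and test data.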

Handling Deployments and Batch Serving

The catalog stores deployment metadata when using the deploy() method. To list or retrieve Deployment objects, use list_deployments(), get_deployment(), and get_deployment_by_id().
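
A lookup that first checks the listing avoids an error when a deployment does not exist. The following sketch is one way to do that, under two assumptions: the DataFrame returned by list_deployments() has a name column, and get_deployment() accepts the deployment name. find_deployment is a hypothetical helper name.

```python
from typing import Any, Optional


def find_deployment(catalog: Any, deployment_name: str) -> Optional[Any]:
    """Return the named Deployment object, or None if it is not listed."""
    deployments = catalog.list_deployments()
    # Assumption: the listing exposes a "name" column. Converting to a list
    # checks the values rather than the DataFrame index.
    if deployment_name not in list(deployments["name"]):
        return None
    # Assumption: get_deployment() takes the deployment name.
    return catalog.get_deployment(deployment_name)
```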

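For batch serving, BatchRequestTable and BatchFeatureTable objects are added to the catalog. To list or retrieve these tables, use the catalog's list and get methods for each table type, such as list_batch_request_tables() and get_batch_request_table(), or list_batch_feature_tables() and get_batch_feature_table().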

Deleting Assets in a Catalog

Delete Feature and FeatureList objects that are in the "DRAFT" state from the catalog using the delete() method:

my_feature.delete()
my_feature_list.delete()

HistoricalFeatureTable and BatchFeatureTable objects can be removed from the catalog at any time. ObservationTable and BatchRequestTable objects can also be deleted, provided they are not related to any HistoricalFeatureTable or BatchFeatureTable object.

Other objects can only be deprecated in the catalog.
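
Since only "DRAFT" features and feature lists can be deleted, a cleanup script may want to check the state before calling delete(). The sketch below illustrates the idea without assuming how the state is exposed: delete_drafts is a hypothetical helper, and the caller supplies a get_state function that returns the object's state as a string or as an enum with a value attribute.

```python
from typing import Any, Callable, Iterable, List, Tuple


def delete_drafts(
    objects: Iterable[Any], get_state: Callable[[Any], Any]
) -> Tuple[List[Any], List[Any]]:
    """Delete objects whose state is exactly "DRAFT"; return (deleted, skipped)."""
    deleted, skipped = [], []
    for obj in objects:
        state = get_state(obj)
        # Accept either a plain string or an enum-like object with `.value`.
        state_name = getattr(state, "value", state)
        if state_name == "DRAFT":
            obj.delete()
            deleted.append(obj)
        else:
            # Non-draft objects are left untouched; they can only be deprecated.
            skipped.append(obj)
    return deleted, skipped
```

How the state is read is left to the caller. For a feature this might be its readiness and for a feature list its status, but those attribute names are assumptions and should be checked against the SDK reference.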