ObservationTable
An ObservationTable object is a representation of an observation set in the feature store. It combines historical points-in-time and entity values to make historical feature requests, usually for training and testing machine learning applications.
Creating ObservationTable Objects
To create an ObservationTable object, you have several options: upload a CSV or Parquet file directly, or create it from a SourceTable object or a View object.
Note
The column with entity values must use an accepted serving name.
The column containing points-in-time must be labelled "POINT_IN_TIME" and should contain UTC timestamps.
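As a minimal sketch of a file that meets these two requirements, the snippet below writes a small CSV using only Python's standard library. The serving name `cust_id`, the file name, and the timestamp text format are assumptions made for illustration; substitute the serving names accepted in your own catalog.

```python
import csv
from datetime import datetime, timezone

# Hypothetical observation set: (point in time, entity value) pairs.
# The datetimes are timezone-aware so the conversion to UTC is unambiguous.
rows = [
    (datetime(2023, 1, 15, 8, 30, tzinfo=timezone.utc), "C001"),
    (datetime(2023, 2, 1, 17, 0, tzinfo=timezone.utc), "C002"),
]


def write_observation_csv(path, rows, serving_name="cust_id"):
    """Write rows to a CSV with a POINT_IN_TIME column holding UTC timestamps.

    `serving_name` is an assumed example, not a name defined by this page.
    """
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["POINT_IN_TIME", serving_name])
        for point_in_time, entity_value in rows:
            # Convert to UTC, then drop the offset so a plain UTC timestamp is written.
            utc = point_in_time.astimezone(timezone.utc).replace(tzinfo=None)
            writer.writerow([utc.isoformat(sep=" "), entity_value])


write_observation_csv("observation_set.csv", rows)
```

The resulting file has a `POINT_IN_TIME` header followed by the serving-name column, which is the layout the note above describes.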
Uploading a file:
To upload a file:
- Use the upload() method with the file path, table name, purpose, and primary entities specified. The file can be either a CSV or a Parquet file.
- Ensure the file's column names include "POINT_IN_TIME" and the accepted serving names for primary entities.
observation_table = fb.ObservationTable.upload(
    file_path="path/to/csv/file.csv",
    name=<observation_table_name>,
    purpose=fb.Purpose.PREVIEW,
    primary_entities=[<primary_entity_name>],
)
Creating from a SourceTable object:
To create an ObservationTable object from a SourceTable object:
- Select the source table from the feature store.
- Use the create_observation_table() method, specifying columns, renaming if necessary, sample size, and table name.

observation_table = source_table.create_observation_table(
    name=<observation_table_name>,
    sample_rows=<desired_sample_size>,
    columns=[<timestamp_column_name>, <entity_column_name>],
    columns_rename_mapping={
        <timestamp_column_name>: "POINT_IN_TIME",
        <entity_column_name>: <entity_serving_name>,
    },
    primary_entities=[<primary_entity_name>],
)
Creating from a View object:
To create an ObservationTable object from a View object:
- Use the create_observation_table() method from the View object with similar parameters as above.

observation_table = view.create_observation_table(
    name=<observation_table_name>,
    sample_rows=<desired_sample_size>,
    columns=[<timestamp_column_name>, <entity_column_name>],
    columns_rename_mapping={
        <timestamp_column_name>: "POINT_IN_TIME",
        <entity_column_name>: <entity_serving_name>,
    },
    primary_entities=[<primary_entity_name>],
)
Additional Operations:
- Download the table by using the download() method.
- Convert the table to a Pandas DataFrame by using the to_pandas() method.
- Delete the table, if not needed, with the delete() method.
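The sketch below combines two of these operations by keeping a local copy of a table's rows before deleting it. The helper name is hypothetical. It relies only on the to_pandas() and delete() methods named above, called without arguments, and assumes `observation_table` is an existing ObservationTable object.

```python
def snapshot_then_delete(observation_table):
    """Return the table's rows as a DataFrame, then delete the table."""
    # Read the rows first: this helper assumes a deleted table cannot be read back.
    df = observation_table.to_pandas()
    observation_table.delete()
    return df
```

The same pattern applies with download() in place of to_pandas() when a file on disk is preferred over an in-memory DataFrame.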
Linking an Observation Table to a Context
After creating an Observation Table, you can link it to a Context with the add_observation_table() method so that it can be reused.
You can also set an observation table as the default EDA or preview table for the Context using the update_default_eda_table() and update_default_preview_table() methods.
Finally, you can list the observation tables associated with the Context using the list_observation_tables() method.
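A minimal sketch of how these Context methods might be used together is shown below. The helper and its flags are hypothetical, and it assumes that each method accepts the observation table's name as a single positional argument, as add_observation_table() does in the Use Case example on this page. Check the SDK reference for the exact signatures.

```python
def link_table_to_context(context, table_name,
                          make_default_eda=False,
                          make_default_preview=False):
    """Link an observation table to a Context and optionally make it a default.

    Assumes `context` is an existing Context object and that each method
    takes the table name as its only argument (an assumption, see above).
    """
    context.add_observation_table(table_name)
    if make_default_eda:
        context.update_default_eda_table(table_name)
    if make_default_preview:
        context.update_default_preview_table(table_name)
    # Return the current listing so the caller can confirm the link.
    return context.list_observation_tables()
```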
Adding Target values to an Observation Table
Follow these steps to add target values to an observation table:
- First, get the relevant Target object.
- Then use its compute_target_table() method to return a new ObservationTable object that includes target values. This method also stores the new table.
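The two steps can be sketched as a small helper. The helper name is hypothetical, and the sketch rests on two assumptions not confirmed by this page: that the catalog exposes a get_target() method taking the target's name, and that compute_target_table() takes the source observation table followed by the name for the new table.

```python
def add_target_values(catalog, observation_table, target_name, new_table_name):
    """Return a new observation table that includes values for the named target."""
    # Step 1: look up the Target object (get_target() is an assumed catalog method).
    target = catalog.get_target(target_name)
    # Step 2: compute target values; the argument order here is an assumption.
    return target.compute_target_table(observation_table, new_table_name)
```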
Calling compute_target_table() automatically associates the new Observation Table with the Use Case linked to the source Observation Table's Context and the Target.
If needed, the table can be manually linked to a Use Case. To do this, use the add_observation_table() method.
use_case = catalog.get_use_case("Credit Card Fraud Detection")
use_case.add_observation_table(<observation_table_name>)
Updating the Purpose of an Observation Table
To update the purpose of an ObservationTable object, use the update_purpose() method.
To get the purpose of an ObservationTable object, use the purpose property.
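As a minimal sketch, the helper below updates the purpose and reads it back through the purpose property. The helper name is hypothetical, and it assumes update_purpose() accepts a single purpose value such as the fb.Purpose.PREVIEW member used in the upload example.

```python
def change_purpose(observation_table, new_purpose):
    """Update the table's purpose and return (previous, current)."""
    previous = observation_table.purpose
    # Assumption: update_purpose() takes the new purpose as its only argument.
    observation_table.update_purpose(new_purpose)
    return previous, observation_table.purpose
```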
Listing and Retrieving ObservationTable Objects
To list the ObservationTable objects in the catalog, use the list_observation_tables() method.
To retrieve a specific ObservationTable by its name from the catalog, use the get_observation_table() method.
To retrieve a specific ObservationTable by its Object ID from the catalog, use the get_observation_table_by_id() method.
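The two retrieval methods can be wrapped in one lookup, sketched below. The helper is hypothetical and assumes `catalog` is an active catalog object whose get_observation_table() and get_observation_table_by_id() methods each take a single argument (the name or the Object ID, respectively).

```python
def find_observation_table(catalog, name=None, table_id=None):
    """Retrieve an ObservationTable by Object ID if given, otherwise by name."""
    if table_id is not None:
        return catalog.get_observation_table_by_id(table_id)
    if name is not None:
        return catalog.get_observation_table(name)
    raise ValueError("Provide either name or table_id")
```

Calling catalog.list_observation_tables() first is a convenient way to find the names available for lookup.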