ObservationTable

An ObservationTable object is a representation of an observation set in the feature store. It combines historical points-in-time and entity values to make historical feature requests, usually for training and testing machine learning applications.

Creating ObservationTable Objects

To create an ObservationTable object, you have multiple options: you can upload a CSV or Parquet file directly, or you can create it from either a SourceTable object or a View object.

Note

The column with entity values must use an accepted serving name.

The column containing points-in-time must be labelled "POINT_IN_TIME" and should contain UTC timestamps.
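
As a minimal sketch of the two requirements in this note, the snippet below uses only the Python standard library to convert timezone-aware timestamps to UTC and write them under a "POINT_IN_TIME" header next to an entity column. The serving name CUSTOMER_ID, the sample rows, the helper names, and the timestamp string format are assumptions made for illustration; they are not part of the SDK.

```python
import csv
import io
from datetime import datetime, timedelta, timezone


def to_utc_string(ts: datetime) -> str:
    """Convert a timezone-aware datetime to a UTC timestamp string."""
    if ts.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    # The output format is an assumption; adjust it to what your store expects.
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


def write_observation_csv(rows, serving_name, out):
    """Write (timestamp, entity value) pairs as an observation-set CSV."""
    writer = csv.writer(out)
    writer.writerow(["POINT_IN_TIME", serving_name])
    for ts, entity_value in rows:
        writer.writerow([to_utc_string(ts), entity_value])


# Hypothetical observations recorded at UTC+02:00.
local_tz = timezone(timedelta(hours=2))
rows = [
    (datetime(2023, 6, 1, 10, 30, tzinfo=local_tz), "C001"),
    (datetime(2023, 6, 2, 0, 15, tzinfo=local_tz), "C002"),
]

buffer = io.StringIO()
write_observation_csv(rows, "CUSTOMER_ID", buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Rejecting naive datetimes is a deliberate choice here: a timestamp with no timezone cannot be converted to UTC without guessing.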

Uploading a file:

To upload a file:

  1. Use the upload() method with the file path, table name, purpose, and primary entities specified. This file can be either a CSV or a Parquet file.
  2. Ensure the file's column names include "POINT_IN_TIME" and the accepted serving names for primary entities.
observation_table = fb.ObservationTable.upload(
    file_path="path/to/csv/file.csv",
    name=<observation_table_name>,
    purpose=fb.Purpose.PREVIEW,
    primary_entities=[<primary_entity_name>],
)
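
Before uploading, it can help to check the file header locally. The following is a hedged sketch using only the standard library: it reports which required column names are absent from a CSV header. The function name missing_columns and the serving name CUSTOMER_ID are hypothetical, and the check does not replace whatever validation the SDK itself performs.

```python
import csv
import io

REQUIRED_TIMESTAMP_COLUMN = "POINT_IN_TIME"


def missing_columns(csv_file, serving_names):
    """Return the required column names that are absent from the CSV header."""
    header = next(csv.reader(csv_file), [])
    required = [REQUIRED_TIMESTAMP_COLUMN, *serving_names]
    return [name for name in required if name not in header]


# In-memory stand-ins for files; with a real file, pass open(path, newline="").
good = io.StringIO("POINT_IN_TIME,CUSTOMER_ID\n2023-06-01 08:30:00,C001\n")
bad = io.StringIO("POINT-IN-TIME,CUSTOMER_ID\n2023-06-01 08:30:00,C001\n")

good_result = missing_columns(good, ["CUSTOMER_ID"])
bad_result = missing_columns(bad, ["CUSTOMER_ID"])
print(good_result)
print(bad_result)
```

The second file uses a hyphenated header, so the check flags "POINT_IN_TIME" as missing.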

Creating from a SourceTable object:

To create an ObservationTable object from a SourceTable object:

  1. Select the source table from the feature store.
    ds = fb.FeatureStore.get(<feature_store_name>).get_data_source()
    source_table = ds.get_source_table(
        database_name=<database_name>,
        schema_name=<schema_name>,
        table_name=<table_name>
    )
    
  2. Use the create_observation_table() method, specifying the columns to keep, any column renames needed, the sample size, and the table name.
    observation_table = source_table.create_observation_table(
        name=<observation_table_name>,
        sample_rows=<desired_sample_size>,
        columns=[<timestamp_column_name>, <entity_column_name>],
        columns_rename_mapping={
            <timestamp_column_name>: "POINT_IN_TIME",
            <entity_column_name>: <entity_serving_name>,
        },
        primary_entities=[<primary_entity_name>],
    )
    

Creating from a View object:

To create an ObservationTable object from a View object:

  1. Use the create_observation_table() method of the View object with the same parameters as for a SourceTable object.
    observation_table = view.create_observation_table(
        name=<observation_table_name>,
        sample_rows=<desired_sample_size>,
        columns=[<timestamp_column_name>, <entity_column_name>],
        columns_rename_mapping={
            <timestamp_column_name>: "POINT_IN_TIME",
            <entity_column_name>: <entity_serving_name>,
        },
        primary_entities=[<primary_entity_name>],
    )
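
The columns and columns_rename_mapping arguments used with create_observation_table() can be derived from one description of the source columns. The helper below is a sketch under stated assumptions: build_column_arguments is a hypothetical name, the column names are made up, and it assumes that columns already carrying the required name can be left out of the mapping (renaming "if necessary").

```python
def build_column_arguments(timestamp_column, entity_columns):
    """Build the column list and rename mapping for an observation table.

    entity_columns maps each source column name to its entity serving name.
    """
    columns = [timestamp_column, *entity_columns]
    targets = {timestamp_column: "POINT_IN_TIME", **entity_columns}
    # Keep only the columns whose name actually changes.
    rename_mapping = {src: dst for src, dst in targets.items() if src != dst}
    return columns, rename_mapping


# Hypothetical source columns: STORE_ID already matches its serving name.
columns, rename_mapping = build_column_arguments(
    "event_ts",
    {"cust_id": "CUSTOMER_ID", "STORE_ID": "STORE_ID"},
)
print(columns)
print(rename_mapping)
```

The two results would then be passed as columns=columns and columns_rename_mapping=rename_mapping.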
    

Additional Operations:

  • Download the table by using the download() method:

    observation_table.download()
    

  • Convert the table to a Pandas DataFrame by using the to_pandas() method:

    observation_table.to_pandas()
    

  • Delete the table, if it is no longer needed, with the delete() method:

    observation_table.delete()
    

Linking an Observation Table to a Context

After creating an Observation Table, you can link it to a Context with the add_observation_table() method so that it can be reused.

context = catalog.get_context("context")
context.add_observation_table(<observation_table_name>)

You can also set an observation table as the Context's default EDA table or default preview table using the update_default_eda_table() and update_default_preview_table() methods, respectively.

context.update_default_eda_table(<observation_table_name>)
context.update_default_preview_table(<observation_table_name>)

Finally, you can list observation tables associated with the Context using the list_observation_tables() method.

context.list_observation_tables()

Adding Target values to an Observation Table

Follow these steps to add target values to an observation table:

  1. First get the relevant Target object:

    my_target = catalog.get_target(<target_name>)
    

  2. Then use its compute_target_table() method to return a new ObservationTable object that includes target values. This method also stores the new table:

    observation_table_with_target = my_target.compute_target_table(
        observation_table,
        observation_table_name='Customer Purchase next 2w'
    )
    

This will automatically associate the Observation Table with the Use Case linked to the source Observation Table's Context and the Target.

If needed, the table can be manually linked to a Use Case. To do this, use the add_observation_table() method.

use_case = catalog.get_use_case("Credit Card Fraud Detection")
use_case.add_observation_table(<observation_table_name>)

Updating the Purpose of an Observation Table

To update the purpose of an ObservationTable object, use the update_purpose() method:

observation_table_with_target.update_purpose("training")

To get the purpose of an ObservationTable object, use the purpose property:

observation_table_with_target.purpose

Listing and Retrieving ObservationTable Objects

To list the ObservationTable objects in the catalog, use the list_observation_tables() method:

catalog.list_observation_tables()

To retrieve a specific ObservationTable by its name from the catalog, use the get_observation_table() method:

observation_table = catalog.get_observation_table(<observation_table_name>)

To retrieve a specific ObservationTable by its Object ID from the catalog, use the get_observation_table_by_id() method:

observation_table = catalog.get_observation_table_by_id(<observation_table_id>)