featurebyte.SourceTable.create_observation_table

create_observation_table(
name: str,
sample_rows: Optional[int]=None,
columns: Optional[list[str]]=None,
columns_rename_mapping: Optional[dict[str, str]]=None,
context_name: Optional[str]=None,
skip_entity_validation_checks: Optional[bool]=False,
primary_entities: Optional[List[str]]=None,
target_column: Optional[str]=None
) -> ObservationTable

Description

Creates an ObservationTable from the SourceTable. When you specify the columns and columns_rename_mapping parameters, make sure that the table, after column selection and renaming, has:

  • column(s) containing entity values with an accepted serving name.
  • a column containing historical points-in-time in UTC. The column name must be "POINT_IN_TIME".
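
The two requirements above can be checked on plain column names before any table is created. The following is a minimal sketch in pure Python, not part of the featurebyte API: the helpers resulting_columns and missing_requirements, the sample column names, and the serving name "CUSTOMER_ID" are all hypothetical. It also assumes that columns are selected first and renamed second, so that the keys of columns_rename_mapping are source column names.

```python
from typing import Dict, List, Optional, Set

POINT_IN_TIME = "POINT_IN_TIME"


def resulting_columns(
    source_columns: List[str],
    columns: Optional[List[str]] = None,
    columns_rename_mapping: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Column names left after selecting, then renaming (hypothetical helper)."""
    # None means "keep every column", mirroring the documented default.
    selected = list(source_columns) if columns is None else list(columns)
    missing = [c for c in selected if c not in source_columns]
    if missing:
        raise ValueError(f"Columns not found in source table: {missing}")
    # Assumption: mapping keys are the old (source) names, values the new names.
    mapping = columns_rename_mapping or {}
    return [mapping.get(c, c) for c in selected]


def missing_requirements(final_columns: List[str], serving_names: Set[str]) -> List[str]:
    """List which of the two documented requirements the columns fail to meet."""
    problems = []
    if POINT_IN_TIME not in final_columns:
        problems.append(f'no column named "{POINT_IN_TIME}"')
    if not serving_names.intersection(final_columns):
        problems.append("no column named after an accepted serving name")
    return problems


# Hypothetical source table with three columns; keep two and rename both.
final = resulting_columns(
    source_columns=["event_ts", "cust_id", "amount"],
    columns=["event_ts", "cust_id"],
    columns_rename_mapping={"event_ts": "POINT_IN_TIME", "cust_id": "CUSTOMER_ID"},
)
print(final)
print(missing_requirements(final, serving_names={"CUSTOMER_ID"}))
```

An empty list from missing_requirements means both requirements are met for the chosen columns and mapping. The check covers names only; it does not verify that the timestamps are in UTC or that entity validation will pass.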

Parameters

  • name: str
    Observation table name.

  • sample_rows: Optional[int]
    Optionally sample the source table to this number of rows before creating the observation table.

  • columns: Optional[list[str]]
    Include only these columns when creating the observation table. If None, all columns are included.

  • columns_rename_mapping: Optional[dict[str, str]]
    Rename columns in the source table using this mapping from old column names to new column names when creating the observation table. If None, no columns are renamed.

  • context_name: Optional[str]
    Context name for the observation table.

  • skip_entity_validation_checks: Optional[bool]
    default: False
    Skip entity validation checks when creating the observation table.

  • primary_entities: Optional[List[str]]
    List of primary entities for the observation table.

  • target_column: Optional[str]
    Name of the column in the observation table that stores the target values. The target column name must match an existing target namespace in the catalog. The data type and primary entities must match those in the target namespace.

Returns

  • ObservationTable

Examples

>>> import featurebyte as fb
>>> ds = fb.FeatureStore.get("<feature_store_name>").get_data_source()
>>> source_table = ds.get_source_table(
...   database_name="<database_name>",
...   schema_name="<schema_name>",
...   table_name="<table_name>",
... )
>>> # Keep only the timestamp and entity columns, then rename them to the
>>> # required "POINT_IN_TIME" name and the entity's serving name.
>>> observation_table = source_table.create_observation_table(
...   name="<observation_table_name>",
...   sample_rows=desired_sample_size,
...   columns=["<timestamp_column_name>", "<entity_column_name>"],
...   columns_rename_mapping={
...     "<timestamp_column_name>": "POINT_IN_TIME",
...     "<entity_column_name>": "<entity_serving_name>",
...   },
...   context_name="<context_name>",
... )