featurebyte.SourceTable.create_observation_table

create_observation_table(
name: str,
sample_rows: Optional[int]=None,
columns: Optional[list[str]]=None,
columns_rename_mapping: Optional[dict[str, str]]=None,
context_name: Optional[str]=None,
skip_entity_validation_checks: Optional[bool]=False,
primary_entities: Optional[List[str]]=None
) -> ObservationTable

Description

Creates an ObservationTable from the SourceTable. When you specify the columns and columns_rename_mapping parameters, make sure that the resulting table (after column selection and renaming) has:

  • one or more columns containing entity values, each named with an accepted serving name.
  • a column containing historical points-in-time in UTC. The column name must be "POINT_IN_TIME".
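
Because the points-in-time must be in UTC, timestamps recorded in local time need converting before they are loaded into the source table. The following is a minimal sketch using only the Python standard library. The helper name to_utc and the fixed UTC offset are assumptions made for illustration; they are not part of featurebyte, and real data with daylight saving time would need a proper time zone database.

```python
from datetime import datetime, timedelta, timezone


def to_utc(local_ts: str, utc_offset_hours: float) -> datetime:
    """Convert a naive ISO-8601 timestamp to a timezone-aware UTC datetime.

    `local_ts` is interpreted as wall-clock time at a fixed offset from UTC
    (hypothetical helper for illustration, not a featurebyte function).
    """
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    # Attach the local offset to the naive timestamp, then shift to UTC.
    local = datetime.fromisoformat(local_ts).replace(tzinfo=local_zone)
    return local.astimezone(timezone.utc)


# 10:00 at UTC-5 is the same instant as 15:00 UTC.
point_in_time = to_utc("2023-01-15 10:00:00", -5)
print(point_in_time.isoformat())
```

Values converted this way can then be written to the column that is selected and renamed to "POINT_IN_TIME".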

Parameters

  • name: str
    Observation table name.

  • sample_rows: Optional[int]
    Optionally sample the source table to this number of rows before creating the observation table.

  • columns: Optional[list[str]]
    Include only these columns when creating the observation table. If None, all columns are included.

  • columns_rename_mapping: Optional[dict[str, str]]
    Rename columns in the source table using this mapping from old column names to new column names when creating the observation table. If None, no columns are renamed.

  • context_name: Optional[str]
    Name of the context to associate with the observation table.

  • skip_entity_validation_checks: Optional[bool]
    default: False
    Skip entity validation checks when creating the observation table.

  • primary_entities: Optional[List[str]]
    List of primary entities for the observation table.
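
To see how columns and columns_rename_mapping combine, the sketch below computes the column names an observation table would end up with and checks them against the requirements in the Description. It is plain Python and does not call featurebyte. The helper names, the example column names, and the assumption that selection uses source names and is applied before renaming are all illustrative.

```python
from typing import Dict, Iterable, List, Optional

POINT_IN_TIME = "POINT_IN_TIME"


def resulting_columns(
    source_columns: List[str],
    columns: Optional[List[str]] = None,
    columns_rename_mapping: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Return the column names left after selecting, then renaming.

    Hypothetical helper: assumes `columns` refers to source names and that
    the mapping goes from old names to new names.
    """
    selected = list(source_columns) if columns is None else list(columns)
    unknown = [c for c in selected if c not in source_columns]
    if unknown:
        raise ValueError(f"Columns not found in source table: {unknown}")
    mapping = columns_rename_mapping or {}
    # Columns without an entry in the mapping keep their original name.
    return [mapping.get(c, c) for c in selected]


def missing_requirements(
    final_columns: Iterable[str], serving_names: Iterable[str]
) -> List[str]:
    """List which observation table requirements the final columns fail."""
    final = set(final_columns)
    problems = []
    if POINT_IN_TIME not in final:
        problems.append(f'no column named "{POINT_IN_TIME}"')
    if not final & set(serving_names):
        problems.append("no column named with an accepted serving name")
    return problems


# Illustrative source table with three columns (names are made up).
final = resulting_columns(
    ["event_ts", "cust_id", "amount"],
    columns=["event_ts", "cust_id"],
    columns_rename_mapping={"event_ts": POINT_IN_TIME, "cust_id": "CUSTOMER_ID"},
)
print(final)
print(missing_requirements(final, serving_names={"CUSTOMER_ID"}))
```

Running a check like this before calling create_observation_table can catch a missing "POINT_IN_TIME" rename early, though the entity validation performed by the library remains the authoritative check.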

Returns

  • ObservationTable

Examples

>>> import featurebyte as fb
>>> # Look up the source table through the feature store's data source.
>>> ds = fb.FeatureStore.get("<feature_store_name>").get_data_source()
>>> source_table = ds.get_source_table(
...   database_name="<database_name>",
...   schema_name="<schema_name>",
...   table_name="<table_name>",
... )
>>> # Keep only the timestamp and entity columns, and rename them to
>>> # "POINT_IN_TIME" and the entity's serving name.
>>> observation_table = source_table.create_observation_table(
...   name="<observation_table_name>",
...   sample_rows=1000,  # illustrative sample size
...   columns=["<timestamp_column_name>", "<entity_column_name>"],
...   columns_rename_mapping={
...     "<timestamp_column_name>": "POINT_IN_TIME",
...     "<entity_column_name>": "<entity_serving_name>",
...   },
...   context_name="<context_name>",
... )