featurebyte.SourceTable.create_snapshots_table

create_snapshots_table(
name: str,
snapshot_datetime_column: str,
snapshot_datetime_schema: TimestampSchema,
time_interval: TimeInterval,
series_id_column: Optional[str],
record_creation_timestamp_column: Optional[str]=None,
description: Optional[str]=None,
datetime_partition_column: Optional[str]=None,
datetime_partition_schema: Optional[TimestampSchema]=None
) -> SnapshotsTable

Description

Creates a SnapshotsTable object from a source table and adds it to the catalog. To create a SnapshotsTable, you need to identify two columns: the one holding the key of the entity being snapshotted (the series ID) and the one holding the snapshot datetime.

After creation, the table can optionally incorporate additional column-level metadata to further aid feature engineering. This can include tagging columns that identify or reference entities, providing information about the semantics of the table columns, specifying default cleaning operations, or furnishing descriptions of its columns.

Parameters

  • name: str
    The desired name for the new table.

  • snapshot_datetime_column: str
    Column representing the datetime of the snapshot.

  • snapshot_datetime_schema: TimestampSchema
    The schema of the snapshot datetime column. A timezone column is not supported.

  • time_interval: TimeInterval
    Specifies the frequency of snapshots. Note that only intervals defined with a single time unit (e.g., 1 day, 1 week) are supported.

  • series_id_column: Optional[str]
    Column representing the entity being snapshotted. Its values must be unique within each snapshot datetime.

  • record_creation_timestamp_column: Optional[str]
    The optional column for the timestamp when a record was created.

  • description: Optional[str]
    The optional description for the new table.

  • datetime_partition_column: Optional[str]
    The optional datetime column used for partitioning the snapshots table.

  • datetime_partition_schema: Optional[TimestampSchema]
    The optional timestamp schema for the datetime partition column.
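
The uniqueness requirement on series_id_column can be checked on a data sample before registering the table. The following is a minimal sketch in plain Python, not part of the featurebyte API: the helper name find_duplicate_snapshot_keys and the sample rows are hypothetical, and it assumes the data has already been loaded as (snapshot datetime, series ID) pairs.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def find_duplicate_snapshot_keys(
    rows: Iterable[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Return (snapshot_datetime, series_id) pairs that occur more than once."""
    counts = Counter(rows)
    return sorted(pair for pair, n in counts.items() if n > 1)


# Hypothetical sample rows: (snapshot date, store identifier).
rows = [
    ("2024-01-01", "store_a"),
    ("2024-01-01", "store_b"),
    ("2024-01-02", "store_a"),
    ("2024-01-02", "store_a"),  # same store twice within one snapshot
]

duplicates = find_duplicate_snapshot_keys(rows)
print(duplicates)
```

An empty result suggests that the sampled rows satisfy the constraint. Any pair returned is a series ID that appears more than once within a single snapshot datetime.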

Returns

  • SnapshotsTable
    SnapshotsTable created from the source table.

Examples

Create a snapshots table from a source table.

>>> # Register GROCERYPROFILES as a snapshots table
>>> source_table = ds.get_source_table(
...     database_name="spark_catalog", schema_name="GROCERY", table_name="GROCERYPROFILES"
... )
>>> sales_table = source_table.create_snapshots_table(
...     name="GROCERYPROFILES",
...     snapshot_datetime_column="Date",
...     snapshot_datetime_schema=TimestampSchema(timezone="Etc/UTC"),
...     time_interval=TimeInterval(value=1, unit="DAY"),
...     series_id_column="StoreGuid",
...     record_creation_timestamp_column="record_available_at",
... )

See Also