featurebyte.SourceTable.create_time_series_table

create_time_series_table(
name: str,
reference_datetime_column: str,
reference_datetime_schema: TimestampSchema,
time_interval: TimeInterval,
series_id_column: Optional[str],
record_creation_timestamp_column: Optional[str]=None,
description: Optional[str]=None
) -> TimeSeriesTable

Description

Creates a TimeSeriesTable object from a source table and adds it to the catalog. Each row of the source table indicates a specific business event measured at a particular moment.

To create a TimeSeriesTable, you need to identify the columns representing the series ID (the key that distinguishes one time series from another) and the reference datetime.
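As a rough illustration of what these two columns imply, the sketch below groups rows by a series ID and checks that each series is observed at a fixed daily interval. It uses only the standard library and made-up rows shaped like the GROCERYSALES example further down; the helper names `group_by_series` and `is_regular` are hypothetical, and nothing here is part of featurebyte or claims to reproduce any validation it performs.

```python
from collections import defaultdict
from datetime import date, timedelta

# Made-up rows for illustration: (series id, reference date, measured value).
rows = [
    ("store-a", date(2024, 1, 1), 120.0),
    ("store-a", date(2024, 1, 2), 98.5),
    ("store-a", date(2024, 1, 3), 101.0),
    ("store-b", date(2024, 1, 1), 40.0),
    ("store-b", date(2024, 1, 2), 55.25),
]


def group_by_series(rows):
    """Collect the reference dates of each series, sorted chronologically."""
    series = defaultdict(list)
    for series_id, ref_date, _value in rows:
        series[series_id].append(ref_date)
    return {sid: sorted(dates) for sid, dates in series.items()}


def is_regular(dates, step=timedelta(days=1)):
    """Return True if consecutive dates are exactly `step` apart."""
    return all(later - earlier == step for earlier, later in zip(dates, dates[1:]))


series = group_by_series(rows)
regular = {sid: is_regular(dates) for sid, dates in series.items()}
print(regular)
```

Here the series ID plays the role of `series_id_column`, the date plays the role of `reference_datetime_column`, and the one-day step corresponds to a time interval of 1 DAY.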

After creation, the table can optionally be enriched with column-level metadata to further aid feature engineering. This can include tagging columns that identify or reference entities, describing the semantics of the table columns, specifying default cleaning operations, or adding column descriptions.

Parameters

  • name: str
    The desired name for the new table.

  • reference_datetime_column: str
    The column that contains the reference datetime of the associated time series.

  • reference_datetime_schema: TimestampSchema
    The schema of the reference datetime column.

  • time_interval: TimeInterval
    The time interval of the time series.

  • series_id_column: Optional[str]
    The column that represents the unique identifier for each time series.

  • record_creation_timestamp_column: Optional[str]
    The optional column for the timestamp when a record was created.

  • description: Optional[str]
    The optional description for the new table.
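One detail of the signature is easy to miss: `series_id_column` is typed `Optional[str]` but has no default value, so as written it must be passed explicitly, even when the value is `None`. The sketch below shows this with a hypothetical stand-in function that copies the signature above and merely echoes its arguments; it is not the featurebyte implementation, and `Any` stands in for `TimestampSchema` and `TimeInterval`.

```python
import inspect
from typing import Any, Optional


def create_time_series_table_stub(
    name: str,
    reference_datetime_column: str,
    reference_datetime_schema: Any,  # stands in for TimestampSchema
    time_interval: Any,  # stands in for TimeInterval
    series_id_column: Optional[str],
    record_creation_timestamp_column: Optional[str] = None,
    description: Optional[str] = None,
) -> dict:
    """Hypothetical stand-in: returns its arguments as a dict."""
    return dict(locals())


params = inspect.signature(create_time_series_table_stub).parameters
# Parameters without a default must always be supplied by the caller.
required = [n for n, p in params.items() if p.default is inspect.Parameter.empty]
optional = [n for n, p in params.items() if p.default is not inspect.Parameter.empty]
print("required:", required)
print("optional:", optional)
```

Under this reading, only `record_creation_timestamp_column` and `description` can be left out of a call.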

Returns

  • TimeSeriesTable
    TimeSeriesTable created from the source table.

Examples

Create a time series table from a source table.

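>>> # Assumes both classes are exported at the featurebyte top level
>>> from featurebyte import TimeInterval, TimestampSchema
>>> # Register GROCERYSALES as a time series table
>>> # (`ds` is a data source object obtained beforehand)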
>>> source_table = ds.get_source_table(
...     database_name="spark_catalog", schema_name="GROCERY", table_name="GROCERYSALES"
... )
>>> sales_table = source_table.create_time_series_table(
...     name="GROCERYSALES",
...     reference_datetime_column="Date",
...     reference_datetime_schema=TimestampSchema(timezone="Etc/UTC"),
...     time_interval=TimeInterval(value=1, unit="DAY"),
...     series_id_column="StoreGuid",
...     record_creation_timestamp_column="record_available_at",
... )