featurebyte.SourceTable.create_time_series_table¶
create_time_series_table(
name: str,
reference_datetime_column: str,
reference_datetime_schema: TimestampSchema,
time_interval: TimeInterval,
series_id_column: Optional[str],
record_creation_timestamp_column: Optional[str]=None,
description: Optional[str]=None
) -> TimeSeriesTable
Description¶
Creates a TimeSeriesTable object from a source table and adds it to the catalog. Each row of the source table represents a specific business event measured at a particular moment.
To create a TimeSeriesTable, you need to identify the column that holds the series key (series_id_column) and the column that holds the reference datetime (reference_datetime_column). You also supply the schema of the reference datetime column and the time interval of the series.
After creation, column-level metadata can optionally be added to the table to further aid feature engineering. This can include tagging columns that identify or reference entities, describing the semantics of the table columns, specifying default cleaning operations, or providing column descriptions.
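As a hedged illustration of the column-level metadata step, the sketch below tags one column as an entity reference and attaches a description to it. It assumes the table supports column access by name and that the column object exposes `as_entity` and `update_description`; the helper name `annotate_store_column`, the entity name `"store"`, and the description text are hypothetical, and the entity is assumed to be registered in the catalog already.

```python
def annotate_store_column(table, column="StoreGuid", entity_name="store"):
    """Tag a table column as an entity reference and describe it.

    Assumptions (not confirmed by this page): `table[column]` returns a
    column object with `as_entity` and `update_description` methods, and
    an entity called `entity_name` already exists in the catalog.
    """
    col = table[column]
    # Mark the column as identifying or referencing the entity.
    col.as_entity(entity_name)
    # Attach a human-readable description to the column.
    col.update_description("Unique identifier of the store that recorded the sales.")
    return col
```

With a registered table such as `sales_table` from the Examples section, this would be called as `annotate_store_column(sales_table)`.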
Parameters¶
- name: str
The desired name for the new table.
- reference_datetime_column: str
The column that contains the reference datetime of the associated time series.
- reference_datetime_schema: TimestampSchema
The schema of the reference datetime column.
- time_interval: TimeInterval
The time interval of the time series.
- series_id_column: Optional[str]
The column that represents the unique identifier for each time series.
- record_creation_timestamp_column: Optional[str]
The optional column for the timestamp when a record was created.
- description: Optional[str]
The optional description for the new table.
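To make the relationship between the series identifier, the reference datetime, and the time interval concrete, here is a minimal sketch in plain Python, independent of featurebyte. It groups rows by series and reports consecutive reference datetimes that are not exactly one interval apart. The helper `find_interval_gaps`, the sample rows, and the date format are illustrative assumptions; this page does not state that featurebyte requires gap-free series, so treat it as an optional sanity check on source data rather than a description of the library's behavior.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def find_interval_gaps(rows, series_id_column, reference_datetime_column,
                       step=timedelta(days=1), fmt="%Y-%m-%d"):
    """Return {series_id: [(previous, current), ...]} for every pair of
    consecutive reference datetimes that are not exactly `step` apart."""
    by_series = defaultdict(list)
    for row in rows:
        stamp = datetime.strptime(row[reference_datetime_column], fmt)
        by_series[row[series_id_column]].append(stamp)

    gaps = {}
    for series_id, stamps in by_series.items():
        stamps.sort()
        # Compare each timestamp with its successor within the same series.
        irregular = [(a, b) for a, b in zip(stamps, stamps[1:]) if b - a != step]
        if irregular:
            gaps[series_id] = irregular
    return gaps


# Hypothetical daily sales rows keyed by store and date.
rows = [
    {"StoreGuid": "s1", "Date": "2024-01-01"},
    {"StoreGuid": "s1", "Date": "2024-01-02"},
    {"StoreGuid": "s2", "Date": "2024-01-01"},
    {"StoreGuid": "s2", "Date": "2024-01-03"},  # 2024-01-02 is missing
]
gaps = find_interval_gaps(rows, "StoreGuid", "Date")
```

In this sample, only series `s2` is reported, because its two rows are two days apart while the assumed interval is one day. Duplicate timestamps within a series would be reported as well, since their difference is zero.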
Returns¶
- TimeSeriesTable
TimeSeriesTable created from the source table.
Examples¶
Create a time series table from a source table.
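>>> # `ds` is assumed to be a DataSource obtained earlier in the session
>>> from featurebyte import TimeInterval, TimestampSchema
>>> # Register GROCERYSALES as a time series table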
>>> source_table = ds.get_source_table(
... database_name="spark_catalog", schema_name="GROCERY", table_name="GROCERYSALES"
... )
>>> sales_table = source_table.create_time_series_table(
... name="GROCERYSALES",
... reference_datetime_column="Date",
... reference_datetime_schema=TimestampSchema(timezone="Etc/UTC"),
... time_interval=TimeInterval(value=1, unit="DAY"),
... series_id_column="StoreGuid",
... record_creation_timestamp_column="record_available_at",
... )