2. Register Tables
Our catalog is created and we can start registering tables in it.
Step 1: Select Data
We'll use the four tables of our Grocery Dataset:
Table | Description |
---|---|
GROCERYINVOICE | Grocery invoice details, containing the timestamp and the total amount of the invoice |
INVOICEITEMS | The grocery product item details within each invoice, including the quantity, total cost, discount applied, and product ID |
GROCERYCUSTOMER | Customer details, including their name, address, and date of birth |
GROCERYPRODUCT | The product group description for each grocery product |
Step 2: Locate Your Data
From the menu, go to the Explore section and access the Source Tables.
You will find the four tables under the "DEMO_DATASETS" database and the "GROCERY" schema.
Step 3: Understand Table Types
For accurate feature derivation, FeatureByte needs to recognize the roles of different tables.
Each table should be assigned a specific type based on its structure and purpose:
- GROCERYINVOICE --> Event table.
  Why Event Table?
  The table records customer purchase events as invoices, making it suitable for an Event Table designation.
- INVOICEITEMS --> Item table.
  Why Item Table?
  The table records the breakdown of each invoice into individual items, making it suitable for an Item Table designation.
- GROCERYCUSTOMER --> Slowly Changing Dimension (SCD) table.
  Why Slowly Changing Dimension (SCD) table?
  The table tracks customer profiles, whose fields can change over time, making it a Slowly Changing Dimension (SCD) Table.
- GROCERYPRODUCT --> Dimension table.
  Why Dimension table?
  The table provides a static mapping of products to their corresponding product groups, making it a Dimension Table.
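The assignments above can be written down as a small lookup, which is handy as a checklist while registering. This is only an illustrative sketch in plain Python: the `TableType` enum and `tables_of_type` helper are hypothetical names and are not part of any FeatureByte API.

```python
from enum import Enum


class TableType(Enum):
    """The four table roles described in this step (illustrative only)."""
    EVENT = "Event table"
    ITEM = "Item table"
    SCD = "Slowly Changing Dimension (SCD) table"
    DIMENSION = "Dimension table"


# Role assigned to each Grocery Dataset table in this tutorial.
TABLE_TYPES = {
    "GROCERYINVOICE": TableType.EVENT,
    "INVOICEITEMS": TableType.ITEM,
    "GROCERYCUSTOMER": TableType.SCD,
    "GROCERYPRODUCT": TableType.DIMENSION,
}


def tables_of_type(table_type):
    """Return the names of all tables assigned the given role, sorted."""
    return sorted(name for name, t in TABLE_TYPES.items() if t is table_type)
```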
Step 4: Register the GROCERYINVOICE table as an Event Table
- Select the GROCERYINVOICE table.
- Click on
- Set the table type as Event Table.
- Identify the Event Timestamp Column.
  The Event Timestamp Column must be a UTC Timestamp or a Snowflake TIMESTAMP_TZ. Support for string-based datetime formats and local time records will be added soon.
- Specify the Event ID Column if applicable.
- Select the Event Time Zone Offset if applicable.
  Local Date Parts
  The Time Zone offset is used to extract date parts (e.g., hour of the day, weekday) in local time. Support for daylight saving time (DST) will be added soon.
- Specify the Record Creation Timestamp Column if applicable.
- Establish a Default Feature Job Setting, either automatically (if a Record Creation Timestamp Column is provided) or manually.
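To see why the Event Time Zone Offset matters for local date parts, the following sketch shifts a UTC timestamp by an offset before reading the hour and weekday. It is a plain-Python illustration, not how FeatureByte computes these values internally; the `local_date_parts` helper and the `+HH:MM` offset string format are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def local_date_parts(utc_timestamp, offset):
    """Return the local hour and weekday for a UTC timestamp.

    `offset` is a string such as "+08:00" or "-05:00" (assumed format).
    """
    sign = -1 if offset.startswith("-") else 1
    hours, minutes = offset.lstrip("+-").split(":")
    delta = sign * timedelta(hours=int(hours), minutes=int(minutes))
    # Convert the UTC instant to the fixed-offset local time.
    local = utc_timestamp.astimezone(timezone(delta))
    return {"hour": local.hour, "weekday": WEEKDAYS[local.weekday()]}


# A hypothetical invoice recorded late on a Sunday evening in UTC.
event_utc = datetime(2023, 1, 1, 22, 30, tzinfo=timezone.utc)
parts_utc = local_date_parts(event_utc, "+00:00")
parts_local = local_date_parts(event_utc, "+08:00")
```

Here the same event falls on Sunday at hour 22 in UTC but on Monday at hour 6 for a customer at `+08:00`, so features such as "hour of the day" differ depending on whether the offset is applied.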
Step 5: Register the INVOICEITEMS table as an Item Table
- Select the INVOICEITEMS table.
- Click on
- Set the table type as Item Table.
- Identify the Event table it is associated with and the key columns (Item ID Column and Event ID Column).
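The relationship between the two key columns can be pictured with a few rows: the Item ID identifies each line item, while the Event ID links it back to its invoice in the Event table. The rows and column names below (`invoice_id`, `item_id`, `total_cost`) are hypothetical and chosen only for illustration; they are not the dataset's actual column names.

```python
from collections import defaultdict

# Hypothetical event rows: one per invoice.
invoices = [
    {"invoice_id": "inv-1", "amount": 12.5},
    {"invoice_id": "inv-2", "amount": 4.0},
]

# Hypothetical item rows: each has its own item ID plus the event ID
# of the invoice it belongs to.
items = [
    {"item_id": "it-1", "invoice_id": "inv-1", "total_cost": 10.0},
    {"item_id": "it-2", "invoice_id": "inv-1", "total_cost": 2.5},
    {"item_id": "it-3", "invoice_id": "inv-2", "total_cost": 4.0},
]


def items_per_event(item_rows, event_id_column):
    """Group item IDs by the event (invoice) they belong to."""
    grouped = defaultdict(list)
    for row in item_rows:
        grouped[row[event_id_column]].append(row["item_id"])
    return dict(grouped)


by_invoice = items_per_event(items, "invoice_id")

# Every item should point at an invoice that exists in the event rows.
known_invoices = {row["invoice_id"] for row in invoices}
orphans = [row["item_id"] for row in items
           if row["invoice_id"] not in known_invoices]
```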
Step 6: Register the GROCERYCUSTOMER table as an SCD Table
- Select the GROCERYCUSTOMER table.
- Click on
- Set the table type as Slowly Changing Dimension Table.
- Identify its Natural Key Column, Surrogate Key Column and Current Flag Column if applicable.
- Specify the Effective Timestamp Column and its Schema. Ensure the following:
  - If the column is recorded as a string, specify its string-based datetime format.
  - Indicate whether the Effective Timestamp is recorded in UTC or local time.
    - If recorded in local time, you must specify its time zone component.
- Specify the End Timestamp Column and its Schema if applicable. Ensure the following:
  - If the column is recorded as a string, specify its string-based datetime format.
  - Indicate whether the End Timestamp is recorded in UTC or local time.
    - If recorded in local time, you must specify its time zone component.
- Specify the Record Creation Timestamp Column if applicable.
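The timestamp schema settings above (string format, UTC versus local time, time zone component) together determine how a stored value maps to a single point in time. The sketch below illustrates that idea in plain Python. The `to_utc` helper is hypothetical, and the format string uses Python's `strptime` syntax, which is an assumption for this example and may differ from the format syntax the platform or your data warehouse expects.

```python
from datetime import datetime, timedelta, timezone


def to_utc(value, fmt, recorded_in_utc=True, offset_hours=0.0):
    """Parse a string timestamp and normalize it to UTC.

    value           -- the timestamp as stored, e.g. "2023-03-15 09:00:00"
    fmt             -- its string-based datetime format (Python strptime syntax)
    recorded_in_utc -- True if the stored value is already in UTC
    offset_hours    -- local offset from UTC, used only for local-time values
    """
    naive = datetime.strptime(value, fmt)
    if recorded_in_utc:
        return naive.replace(tzinfo=timezone.utc)
    local_tz = timezone(timedelta(hours=offset_hours))
    # Attach the local offset, then convert the instant to UTC.
    return naive.replace(tzinfo=local_tz).astimezone(timezone.utc)


FMT = "%Y-%m-%d %H:%M:%S"
as_utc = to_utc("2023-03-15 09:00:00", FMT)
from_local = to_utc("2023-03-15 09:00:00", FMT,
                    recorded_in_utc=False, offset_hours=8)
```

The same string yields two different instants, eight hours apart, depending on whether it is declared as UTC or as local time at `+08:00`, which is why the time zone component is required for local-time columns.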
Step 7: Register the GROCERYPRODUCT table as a Dimension Table
- Select the GROCERYPRODUCT table.
- Click on
- Set the table type as Dimension Table.
- Identify its Dimension ID Column.
Step 8: Review Registered Tables
Verify the registration by checking the Table Catalog under the Explore section.