4. Add Descriptions and Tag Semantics
Understanding the semantics of data fields and their tables is crucial for creating meaningful features and avoiding noise. However, data scientists often do this informally.
At FeatureByte, we've made this process more systematic. Each data column is mapped to an ontology, and that mapping determines which feature engineering techniques are appropriate for the column. FeatureByte Copilot assists with the mapping: it uses Generative AI to analyze table and column metadata and proposes a semantic tag for each column.
Clear descriptions of your data help Copilot suggest relevant data aggregations, filters, and feature combinations during feature ideation. Copilot can operate without these descriptions, but its recommendations are significantly better with them.
Note
Table and column descriptions are fetched automatically from your Data Warehouse when they are available. If they are missing or incomplete, you can edit and update them.
Step 1: Update Table Descriptions
From the menu, go to the 'Explore' section and open the Table Catalog.
Check that the table descriptions are as follows:
Table | Description |
---|---|
GROCERYCUSTOMER | Customer details, including their name, address, and date of birth |
GROCERYINVOICE | Grocery invoice details, containing the timestamp and the total amount of the invoice |
INVOICEITEMS | The grocery product item details within each invoice, including the quantity, total cost, discount applied, and product ID |
GROCERYPRODUCT | The product group description for each grocery product |
To edit the description of a table:
- Select the table from the Table Catalog.
- Go to the 'About' tab.
- Edit the description using the edit icon next to the description field.
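The check in this step can also be sketched in code. The snippet below is a minimal illustration in plain Python, not the FeatureByte SDK: the `tables_needing_update` helper and the `current_descriptions` snapshot are hypothetical names, and the snapshot stands in for whatever the Table Catalog currently shows. It compares current descriptions against the expected ones listed above and reports which tables still need editing.

```python
# Expected descriptions, copied from the table in Step 1.
EXPECTED_DESCRIPTIONS = {
    "GROCERYCUSTOMER": "Customer details, including their name, address, and date of birth",
    "GROCERYINVOICE": "Grocery invoice details, containing the timestamp and the total amount of the invoice",
    "INVOICEITEMS": "The grocery product item details within each invoice, including the quantity, total cost, discount applied, and product ID",
    "GROCERYPRODUCT": "The product group description for each grocery product",
}


def tables_needing_update(current, expected=EXPECTED_DESCRIPTIONS):
    """Return the tables whose current description is missing or differs."""
    stale = []
    for table, wanted in expected.items():
        # Treat an absent table, None, and an empty string the same way.
        found = (current.get(table) or "").strip()
        if found != wanted:
            stale.append(table)
    return stale


# Hypothetical snapshot of the catalog (assumption, for illustration only).
current_descriptions = {
    "GROCERYCUSTOMER": "Customer details, including their name, address, and date of birth",
    "GROCERYINVOICE": "",
    "INVOICEITEMS": None,
}

print(tables_needing_update(current_descriptions))
```

Any table reported by the helper is one whose description should be edited through the 'About' tab as described above.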
Step 2: Update Column Descriptions
To edit the description of a column in a table:
- Select the table from the Table Catalog.
- Go to the 'Columns' tab.
- Use the edit icon to update column descriptions.
Step 3: Tag Semantics
For each table:
- Select it from the Table Catalog.
- Go to the 'Columns' tab.
- Click 'Run Semantic Type Detection'.
- Review the suggested semantic types.
- Accept or adjust each suggestion, or do nothing and leave the column untagged.
Note
If a column remains semantically untagged, FeatureByte Copilot will repeat the detection during feature ideation.
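To make the three review outcomes concrete, here is a small sketch in plain Python. It is an illustration only and does not use the FeatureByte SDK: the `resolve_tag` helper, the decision labels, and every column and tag name are hypothetical. It shows how accepting, adjusting, or skipping a suggestion determines the final tag, and which columns are left untagged for Copilot to detect again during feature ideation.

```python
from typing import Optional


def resolve_tag(suggested: Optional[str], decision: str,
                override: Optional[str] = None) -> Optional[str]:
    """Return the final semantic tag for one column after review."""
    if decision == "accept":
        return suggested
    if decision == "adjust":
        if not override:
            raise ValueError("an adjusted tag needs an override value")
        return override
    if decision == "skip":
        # Doing nothing leaves the column untagged.
        return None
    raise ValueError(f"unknown decision: {decision!r}")


# Hypothetical review of three columns: (suggested tag, decision, override).
review = {
    "birth_date": ("date_of_birth", "accept", None),
    "customer_name": ("free_text", "adjust", "name"),
    "notes": ("free_text", "skip", None),
}

final_tags = {col: resolve_tag(*args) for col, args in review.items()}

# Columns still untagged are the ones detection would revisit later.
untagged = [col for col, tag in final_tags.items() if tag is None]

print(final_tags)
print(untagged)
```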