8b. Refine Ideation
In the previous tutorial, we explored the Automated Mode of Feature Ideation, where the system independently generated a comprehensive set of features.
Now, we focus on Semi-Automated Mode, which introduces an interactive layer to the ideation workflow. This mode empowers you to review, refine, and enhance the system's recommendations step by step, ensuring that the features align with your specific requirements and domain knowledge.
In this tutorial, you will learn how to:
- Incorporate custom transformations, such as embedding UDFs, to enrich feature engineering.
- Review and adjust the system's suggestions, from table selection to filters.
- Understand how Semi-Automated Mode balances automation efficiency with manual refinement flexibility.
Step 0: Add New User Defined Function (UDF) to the Catalog
Before starting a new Feature Ideation, we will register an "embedding" UDF that leverages the Sentence-BERT (SBERT) transformer model. This UDF will be used to transform the Product Group column of the PRODUCTGROUP table into embeddings.
- Navigate to the User Defined Function Catalog under the 'Formulate' section of the menu.
- Create the "embedding" UDF by clicking .
- Confirm that the new UDF is registered and visible in the catalog.
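To make the role of an embedding UDF concrete, the sketch below shows the general contract such a function fulfils: a string goes in and a fixed-length numeric vector comes out. It is not SBERT and not FeatureByte's UDF API. The function `toy_embedding`, the dimension of 8, and the sample product group names are all hypothetical, and a hashed bag-of-words stands in for the transformer model so that the example runs with the standard library only.

```python
import hashlib
import math
from typing import Dict, List


def toy_embedding(text: str, dim: int = 8) -> List[float]:
    """Map a string to a fixed-length vector (hypothetical stand-in for SBERT)."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # Hash each token to a stable bucket index in [0, dim).
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    # Normalise to unit length; an empty string stays the zero vector.
    return [v / norm for v in vec] if norm > 0 else vec


# Hypothetical Product Group values, used only for illustration.
product_groups = ["Fresh Vegetables", "Frozen Vegetables", "Red Wine"]
embeddings: Dict[str, List[float]] = {
    name: toy_embedding(name) for name in product_groups
}

for name, vector in embeddings.items():
    print(name, [round(v, 3) for v in vector])
```

A real sentence embedding places semantically similar product groups close together, which this toy version does only when names share tokens.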
Step 1: Create New Feature Ideation
- Navigate to Feature Ideation under the 'Ideate' section of the menu and click to start a new ideation process.
- Edit the Feature Ideation name and description by clicking .
Step 2: Start Semi-Automated Mode
- Begin the workflow by clicking .
- Once the run completes, the table selection results are displayed for review.
Step 3: Review Table Selection
- Click to view detailed table selection results.
- To open the detailed report:
  - Click next to the Ideation run name "Semi-Automated Mode".
  - Then click .
- Return to the table selection screen and proceed by clicking .
Step 4: Review Column Semantics Detection
- Review the Column Semantics Detection results.
- Click to view the report.
- Adjust semantic tags as needed. For example, assign the Gender column a semantic type under categorical/nominal_categorical/demographic_attribute/gender in FeatureByte's ontology.
- Click to continue.
Step 5: Review Transforms Detection
- Once the Transforms Detection process finishes, review the results for the INVOICEITEMS table.
- Click to view the report.
- Attempt to create a new Transform, 'Total_Cost / Quantity'.
  - Open the Transform window by clicking .
  - Click .
  - Select the 'Ratio' operation.
  - Choose the Total_Cost column as the numerator and the Quantity column as the denominator.
  - Generate a name and relevance by clicking .
- Review the relevance explanation. If the relevance is low (e.g., the transform is redundant with the existing column Unit Price), delete the transform. Click to delete the transform.
- Close the Transform window by clicking .
- Click to proceed.
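The reason a 'Total_Cost / Quantity' ratio can score low on relevance is easier to see with numbers. The following is a minimal sketch, assuming made-up invoice rows and a hypothetical helper `safe_ratio`; it only illustrates the arithmetic behind the redundancy with Unit Price, not how FeatureByte computes relevance.

```python
from typing import Optional


def safe_ratio(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    if denominator == 0:
        return None
    return numerator / denominator


# Made-up rows for illustration; column names follow the tutorial text.
rows = [
    {"Total_Cost": 10.0, "Quantity": 4, "Unit Price": 2.5},
    {"Total_Cost": 9.0, "Quantity": 3, "Unit Price": 3.0},
    {"Total_Cost": 0.0, "Quantity": 0, "Unit Price": 1.2},
]

ratios = [safe_ratio(r["Total_Cost"], r["Quantity"]) for r in rows]

# The ratio is redundant if it matches Unit Price wherever it is defined.
is_redundant = all(
    ratio is None or abs(ratio - row["Unit Price"]) < 1e-9
    for ratio, row in zip(ratios, rows)
)

print(ratios)
print("redundant with Unit Price:", is_redundant)
```

When the new column carries no information beyond an existing one, as in this sample, deleting the transform keeps the feature space smaller without losing signal.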
Step 6: Review Filters Detection
- Once the Filters Detection process finishes, review the results. In this example, no filters have been detected.
- Create a new filter. Click for the INVOICEITEMS table.
- Select Filter Column. Choose Product Group as the filter column.
- Complete the filter condition by specifying the filter values. This opens a new window listing all eligible values.
- Identify the most relevant values by clicking .
- Automatically create meaningful groups of values by clicking . Select a group if one is relevant.
- Finalize your value selection.
- Generate filter name and relevance.
- Check the relevance of the new filter.
- Save the new filter by clicking .
- Click to proceed.
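Conceptually, a filter on Product Group restricts which invoice items contribute to a feature before any aggregation happens. The sketch below illustrates that idea under stated assumptions: the rows, the `Customer` key, the selected values, and the helper `filtered_sum` are all hypothetical and are not taken from the tutorial dataset or from FeatureByte's API.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set

# Made-up invoice items; "Customer" is a hypothetical entity key.
invoice_items: List[dict] = [
    {"Customer": "C1", "Product Group": "Wine", "Total_Cost": 12.0},
    {"Customer": "C1", "Product Group": "Bread", "Total_Cost": 3.0},
    {"Customer": "C2", "Product Group": "Beer", "Total_Cost": 5.0},
    {"Customer": "C2", "Product Group": "Wine", "Total_Cost": 7.5},
    {"Customer": "C3", "Product Group": "Bread", "Total_Cost": 2.0},
]

# Hypothetical group of filter values chosen during the review step.
selected_values: Set[str] = {"Wine", "Beer"}


def filtered_sum(
    rows: Iterable[dict], column: str, values: Set[str], amount: str
) -> Dict[str, float]:
    """Sum `amount` per customer, keeping only rows whose `column` is in `values`."""
    totals: Dict[str, float] = defaultdict(float)
    for row in rows:
        if row[column] in values:
            totals[row["Customer"]] += row[amount]
    return dict(totals)


spend = filtered_sum(invoice_items, "Product Group", selected_values, "Total_Cost")
print(spend)
```

Customers with no matching rows, such as C3 here, simply have no value for the filtered feature, which is worth keeping in mind when judging a filter's relevance.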
Step 7: Review Feature Ideation Setup
- Review the suggested setup.
- Go to the 'User Defined Function' section and click to use the "embedding" UDF to transform the Product Group column of the PRODUCTGROUP table into embeddings.
- Click to complete Feature Ideation up to Feature Selection.
Step 8: Review the Feature Ideation Report
After the process completes, a feature selection will be displayed for your review.
Accessing the Detailed Report: To view the full report describing each step of the ideation process, click next to the Ideation run name "Semi-Automated Mode".
To visualize the full report with an indexed view in a new tab, click .
Step 9: Add Features to the Feature Catalog
- Go to the Features tab. Clear the search (if you used it) and any prior selection (if any) by clicking .
- Select features in the feature list by clicking .
- Save the selected features into the Feature Catalog by clicking . Name the feature list "SHAP selection with embedding".
Step 10: Run Rule-based Selection
- Switch to All features using the dropdown list next to the Magic Wand.
- Start Feature Selection by clicking on the Magic Wand .
- Select the Rule-Based mode. In this example, we want the top feature for each theme, provided it is among the top 100 features overall.
Once the selection is complete, it is added to the selections list.
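The rule used in this step can be written out as a short procedure: rank all features, keep the first 100, then take the best-ranked feature of each theme among those. The sketch below is one possible reading of that rule. The `Feature` type, the importance scores, and the feature and theme names are hypothetical, and the cutoff is reduced to 3 so that the small sample shows a theme being excluded.

```python
from typing import Dict, List, NamedTuple


class Feature(NamedTuple):
    name: str
    theme: str
    importance: float  # hypothetical ranking score, higher is better


def top_per_theme(features: List[Feature], overall_top_n: int = 100) -> List[Feature]:
    """Keep the best feature of each theme, only if it ranks in the overall top N."""
    ranked = sorted(features, key=lambda f: f.importance, reverse=True)
    selected: Dict[str, Feature] = {}
    for feature in ranked[:overall_top_n]:
        # The first feature seen for a theme is that theme's top feature.
        selected.setdefault(feature.theme, feature)
    return list(selected.values())


# Made-up candidates for illustration.
candidates = [
    Feature("spend_28d", "customer spending", 0.9),
    Feature("spend_7d", "customer spending", 0.8),
    Feature("basket_embedding", "product group", 0.7),
    Feature("days_since_last", "recency", 0.1),
]

selection = top_per_theme(candidates, overall_top_n=3)
print([f.name for f in selection])
```

With the cutoff at 3, the "recency" theme contributes nothing because its best feature ranks fourth overall; with the tutorial's cutoff of 100, every theme in this sample would be represented.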
Step 11: Add Rule-based Selection to the Feature Catalog
- Go to the Features tab. Clear the search (if you used it) and any prior selection (if any) by clicking .
- Select features in the feature list by clicking .
- Save the selected features into the Feature Catalog by clicking . Name the feature list "Top 1 per theme".