8b. Refine Ideation
In the previous tutorial, we explored the Automated Mode of Feature Ideation, where the system independently generated a comprehensive set of features.
Now, we turn our focus to the Semi-Automated Mode, which introduces an interactive layer to the ideation workflow. This mode empowers you to review, refine, and enhance the system's recommendations step by step, ensuring that the features align with your specific requirements and domain knowledge.
Through this tutorial, you will:
- Learn how to incorporate custom transformations, such as embedding UDFs, to enrich feature engineering.
- Review and adjust the system's suggestions, from table selection to filters.
- Understand how Semi-Automated Mode combines the efficiency of automation with the flexibility of manual refinement.
Step 0: Add a New User Defined Function (UDF) to the Catalog
Before starting a new Feature Ideation, we will register an "embedding" UDF that leverages the Sentence-BERT (SBERT) transformer model. This UDF will be used to transform the Product Group column of the PRODUCTGROUP table into embeddings.
- Navigate to the User Defined Function Catalog under the 'Formulate' section of the menu.
- Create the "embedding" UDF by clicking .
- Confirm that the new UDF is registered and visible in the catalog.
Step 1: Create New Feature Ideation
- Navigate to Feature Ideation from the 'Ideate' section of the menu.
- Click to start a new ideation process.
- Edit the Feature Ideation name and description by clicking .
Step 2: Start Semi-Automated Mode
- Begin the workflow by clicking .
- Once the table selection step completes, its results are displayed for review.
Step 3: Review Table Selection
- Click to view details on the table selection.
- To open the detailed report with an indexed view in a new tab:
  - Click next to the Ideation run name "Semi-Automated Mode".
  - Then click .
- After reviewing, return to the table selection screen.
- Keep the selection unchanged and proceed to the next step by clicking .
Step 4: Review Column Semantics Detection
- After the Column Semantics Detection step completes, review the results.
- Click to view the report.
- Adjust semantic tags as needed. For example, assign the Gender column a semantic type under categorical/nominal_categorical/demographic_attribute/gender in FeatureByte's ontology.
- Click to continue.
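The semantic type in the example above is written as a slash-separated path, which reads naturally as a chain from the broadest category down to the most specific tag. The sketch below only illustrates that reading in plain Python. It is not FeatureByte's API, and the helper name `ancestors` is hypothetical.

```python
# Illustration only: treat the semantic type as a path from general to specific.
# `ancestors` is a hypothetical helper, not part of FeatureByte.

def ancestors(semantic_path: str) -> list[str]:
    """Return every prefix of the path, from the root category to the full tag."""
    levels = semantic_path.split("/")
    return ["/".join(levels[:depth]) for depth in range(1, len(levels) + 1)]


gender_path = "categorical/nominal_categorical/demographic_attribute/gender"
chain = ancestors(gender_path)

for node in chain:
    print(node)
```

Read this way, tagging Gender with the most specific node also places it under each broader category in the chain.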
Step 5: Review Transforms Detection
- Once the Transforms Detection process finishes, review the results for the INVOICEITEMS table.
- Click to view the report.
- Attempt to create a new Transform, 'Total_Cost / Quantity'.
  - Open the Transform window by clicking .
  - Click .
  - Select the 'Ratio' operation.
  - Choose the Total_Cost column as the numerator and the Quantity column as the denominator.
  - Generate a name and relevance by clicking .
- Review the relevance explanation. If the relevance is low (e.g., the transform is redundant with the existing Unit Price column), delete the transform. Click to delete the transform.
- Close the Transform window by clicking .
- Click to proceed.
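To see why a 'Total_Cost / Quantity' ratio can turn out to be redundant with Unit Price, here is a minimal sketch in plain Python. The rows are made up for illustration, the helper `ratio` is hypothetical, and this is not how FeatureByte computes the transform or its relevance.

```python
from typing import Optional

# Hypothetical helper: a ratio that is undefined for missing or zero denominators.
def ratio(numerator: Optional[float], denominator: Optional[float]) -> Optional[float]:
    if numerator is None or denominator is None or denominator == 0:
        return None
    return numerator / denominator


# Made-up INVOICEITEMS rows, for illustration only.
rows = [
    {"Total_Cost": 12.0, "Quantity": 3, "Unit Price": 4.0},
    {"Total_Cost": 5.0, "Quantity": 2, "Unit Price": 2.5},
    {"Total_Cost": 7.0, "Quantity": 0, "Unit Price": 7.0},
]

derived = [ratio(row["Total_Cost"], row["Quantity"]) for row in rows]

# The new column adds nothing if it matches Unit Price wherever it is defined.
redundant = all(
    value is None or abs(value - row["Unit Price"]) < 1e-9
    for value, row in zip(derived, rows)
)
print(derived, redundant)
```

When the derived column reproduces an existing one like this, deleting the transform keeps the ideation input free of duplicates.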
Step 6: Review Filters Detection
- Once the Filters Detection process finishes, review the results. In this example, no filters have been detected.
- Create a new filter. Click for the INVOICEITEMS table.
- Select Filter Column. Choose Product Group as the filter column.
- Complete the filter condition by specifying the filter values. This will open a new window listing all eligible values.
- Identify the most relevant values by clicking .
- Automatically create meaningful groups of values by clicking . Select one group if any is relevant.
- Finalize your value selection.
- Generate filter name and relevance.
- Check the relevance of the new filter.
- Save the new filter by clicking .
- Click to proceed.
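Conceptually, the filter built above keeps only the rows whose Product Group belongs to the chosen set of values. The sketch below shows that idea in plain Python. The function `apply_filter`, the sample rows, and the group names are all hypothetical, and it does not reflect how FeatureByte applies filters internally.

```python
# Hypothetical helper: keep rows whose value in `column` is one of the allowed values.
def apply_filter(rows: list[dict], column: str, allowed_values: list[str]) -> list[dict]:
    allowed = set(allowed_values)
    return [row for row in rows if row.get(column) in allowed]


# Made-up rows and Product Group values, for illustration only.
invoice_items = [
    {"item_id": 1, "Product Group": "group_a", "Total_Cost": 12.0},
    {"item_id": 2, "Product Group": "group_b", "Total_Cost": 5.0},
    {"item_id": 3, "Product Group": "group_c", "Total_Cost": 7.0},
    {"item_id": 4, "Product Group": "group_a", "Total_Cost": 3.0},
]

# A "group of values" is simply several Product Group values treated as one condition.
selected_group = ["group_a", "group_c"]
filtered = apply_filter(invoice_items, "Product Group", selected_group)
print([row["item_id"] for row in filtered])
```

Features ideated with such a filter would then aggregate over the filtered rows only.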
Step 7: Review Feature Ideation Setup
- Review the suggested setup.
- Go to the 'User Defined Function' section and click to use the "embedding" UDF to transform the Product Group column of the PRODUCTGROUP table into embeddings.
- Click to complete Feature Ideation.
Step 8: Review the Feature Ideation Report
After the process completes, a table of ideated features will be displayed for your review.
Accessing the Detailed Report: To view the full report describing each step of the ideation process, click next to the Ideation run name "Semi-Automated Mode".
To visualize the full report with an indexed view in a new tab, click .
Step 9: Run EDA
Select All Ideated Features: Click to select all ideated features.
Initiate EDA: Scroll to the bottom of the ideated features table and click to begin the Exploratory Data Analysis (EDA) process.
Step 10: Run Feature Selection
- Start Feature Selection by clicking on the Magic Wand .
- Select the SHAP-Based mode and choose the option to exclude Low Added Value Features.
Once the selection is complete, review the selected features.
Step 11: Add Features to the Feature Catalog
- Clear the search (if you used it) and any prior selection (if any) by clicking
- Select features in the feature list by clicking .
- Save the selected features into the Feature Catalog by clicking .
Step 12: Refine Selection
We will refine our prior selection by using GenAI.
- Start Feature Selection by clicking on the Magic Wand .
- Select the GenAI-Based mode and set target count to 20.
Once the selection is complete, review the selected features.
Step 13: Add GenAI selection to the Feature Catalog
- Clear the search (if you used it) and any prior selection (if any) by clicking
- Select features in the feature list by clicking .
- Save the selected features into the Feature Catalog by clicking .
Step 14: Run Rule-based Selection
- Switch to All features using the dropdown list next to the Magic Wand.
- Start Feature Selection by clicking on the Magic Wand .
- Select the Rule-Based mode. In this example, we want the top feature for each theme, provided it is among the top 100 features overall.
Once the selection is complete, review the selected features.
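The rule in this example ("top feature for each theme if it is part of top 100 features overall") can be sketched in a few lines of Python. This is only an illustration of the logic, not FeatureByte's implementation. The field names `name`, `theme`, and `score` are hypothetical, and the ranking metric used by the platform is not specified here.

```python
# Sketch of the rule: keep the best feature of each theme, but only when that
# feature also ranks within the overall top N. Field names are hypothetical.
def select_top_per_theme(features: list[dict], top_n_overall: int = 100) -> list[dict]:
    ranked = sorted(features, key=lambda f: f["score"], reverse=True)
    shortlist = ranked[:top_n_overall]

    selected: dict[str, dict] = {}
    for feature in shortlist:
        # The shortlist is sorted by score, so the first feature seen for a
        # theme is that theme's top feature.
        selected.setdefault(feature["theme"], feature)
    return list(selected.values())


# Made-up features; a small N keeps the example readable.
features = [
    {"name": "A", "theme": "t1", "score": 0.9},
    {"name": "B", "theme": "t1", "score": 0.8},
    {"name": "C", "theme": "t2", "score": 0.7},
    {"name": "D", "theme": "t3", "score": 0.1},
]

picked = select_top_per_theme(features, top_n_overall=3)
print([f["name"] for f in picked])
```

In this toy run, theme t3 contributes nothing because its best feature falls outside the overall top 3, which mirrors how themes with only weak features are dropped by the rule.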
Step 15: Add Rule-based selection to the Feature Catalog
- Clear the search (if you used it) and any prior selection (if any) by clicking
- Select features in the feature list by clicking .
- Save the selected features into the Feature Catalog by clicking .
Step 16: Manage Feature Selections
Easily manage your feature selections to filter and refine ideated features.
Filtering Ideated Features
You can use any existing selection to filter the ideated features.
Reviewing Prior Selections
To review your past selections, navigate to the Feature Selection tab.
Click on a selection to access its details. Each selection provides information across three tabs:
- About Tab: Displays a description and a summary of the signal range for the selected features.
- Settings Tab: Shows detailed information about how the selection was generated, including parameters and logic used.
- Features Tab: Shows selected features together with their semantic relevance.
Step 17: Download the List of Ideated Features Metadata
Follow these steps to download a CSV file containing metadata for all ideated features (that we will use later for modeling):
- Clear the search (if you used it) and any prior selection (if any) by clicking
- Select .
- Download the CSV file by clicking
- Choose the "filtered features" option and give a name to your file (e.g., "Ideated Features").
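Once downloaded, the metadata file can be inspected with Python's standard `csv` module before it is used for modeling. The sketch below reads from an in-memory sample so that it runs on its own. The column names `name` and `theme` are placeholders, since the actual columns of the exported file are not described in this tutorial.

```python
import csv
import io

# Read feature metadata rows from any text file handle into a list of dicts.
def load_feature_metadata(handle) -> list[dict]:
    return list(csv.DictReader(handle))


# Stand-in for the downloaded file; the columns and values are placeholders.
sample = io.StringIO(
    "name,theme\n"
    "feature_a,theme_1\n"
    "feature_b,theme_2\n"
)

records = load_feature_metadata(sample)
print(len(records), sorted(records[0].keys()))
```

For the real export, the same function can be called on a handle opened with `open(path, newline="", encoding="utf-8")`, where `path` points to the file you saved.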