8. Ideate Features
FeatureByte provides two primary methods for feature creation:
- Manual Creation: Utilize the SDK declarative framework for complete control over feature engineering.
- Automatic Creation: Accelerate feature engineering through Feature Ideation, an AI-powered, automated approach.
How FeatureByte Ideates Features
Feature Ideation uses an agentic approach to tailor features to your use case. It can:
- Analyze tables and relationships to identify relevant data.
- Fill in missing semantic tags using column metadata.
- Suggest useful column transformations, such as time deltas, ratios, and differences.
- Recommend key filters to isolate important events.
- Highlight key columns for advanced feature engineering.
- Propose relevant time windows for feature aggregation.
- Analyze event frequencies to identify timing signals.
- Recommend features and evaluate their semantic relevance to your use case.
- Ensure feature reuse by detecting and avoiding duplicate features in the Catalog.
Each step is documented, ensuring complete transparency and traceability throughout the process.
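To make the kind of column transformations mentioned above (time deltas, ratios, and differences) more concrete, here is a minimal plain-Python sketch. It is only an illustration and not FeatureByte's implementation: the record, its field names, and the `derive_columns` helper are all hypothetical.

```python
from datetime import datetime

# Hypothetical loan-application record; field names are illustrative only.
record = {
    "application_time": datetime(2024, 3, 1, 9, 0),
    "decision_time": datetime(2024, 3, 4, 9, 0),
    "amount_requested": 12000.0,
    "amount_credited": 9000.0,
}


def derive_columns(row):
    """Compute a time delta, a ratio, and a difference from one record."""
    elapsed = row["decision_time"] - row["application_time"]
    delta_days = elapsed.total_seconds() / 86400  # seconds per day
    requested = row["amount_requested"]
    # Guard against division by zero when nothing was requested.
    ratio = row["amount_credited"] / requested if requested else None
    difference = requested - row["amount_credited"]
    return {
        "days_to_decision": delta_days,
        "credited_to_requested_ratio": ratio,
        "requested_minus_credited": difference,
    }


derived = derive_columns(record)
print(derived)
```

Feature Ideation suggests transformations of this kind automatically, so no such code needs to be written when following this tutorial.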
Modes of Operation
Feature Ideation offers two modes of operation:
- Fully automated mode: Automatically executes the entire workflow.
- Semi-automated mode: Enables you to review and refine recommendations step by step.
After ideating features, leverage exploratory data analysis (EDA) and advanced feature selection techniques to finalize your feature set.
This tutorial demonstrates the fully automated mode of Feature Ideation. A detailed guide to the semi-automated mode will follow in the next section.
Note
If you want to learn how to manually create features, please consult our SDK tutorials.
Step 1: Select Your Use Case
- Navigate to Feature Ideation from the 'Ideate' section of the menu.
- Select the use case: "Loan Default by client".
Step 2: Initiate a New Feature Ideation Workflow
- Click [icon] to start the Feature Ideation process.
- Edit the Feature Ideation name and description by clicking [icon].
Step 3: Start Automated Mode
- Begin the automated Feature Ideation workflow by clicking [icon].
- Optionally, stop the process at any stage if needed.
Once the process is initiated, a confirmation shows that the run has started.
Step 4: Review the Feature Ideation Report
After the process completes, a table of ideated features is displayed for your review.
- Access the Detailed Report, which describes each step of the ideation process, by clicking [icon] next to the Ideation name "Automated Mode".
- View the full report with an indexed view in a new tab by clicking [icon].
Step 5: Run EDA
- Select All Ideated Features by clicking [icon].
- Initiate EDA. Scroll to the bottom of the ideated features table and click [icon] to begin the Exploratory Data Analysis (EDA) process.
- Sort by Predictive Score. After EDA completes, each feature has a univariate Predictive Score. Sort the features by this score to identify those with the highest predictive power.
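The sorting itself happens in the UI. As a rough illustration of what ranking by a univariate score amounts to, here is a minimal Python sketch. The feature names and score values are made up, and the `predictive_score` key is a hypothetical stand-in for the score shown in the table.

```python
# Hypothetical features with illustrative scores (not real EDA output).
features = [
    {"name": "feature_a", "predictive_score": 0.12},
    {"name": "feature_b", "predictive_score": 0.47},
    {"name": "feature_c", "predictive_score": 0.30},
]


def rank_by_predictive_score(rows):
    """Return rows sorted from highest to lowest predictive score."""
    return sorted(rows, key=lambda r: r["predictive_score"], reverse=True)


ranked = rank_by_predictive_score(features)
for row in ranked:
    print(f'{row["name"]}: {row["predictive_score"]:.2f}')
```

Because the score is univariate, it rates each feature on its own. A high rank says nothing about redundancy between features, which is what the feature selection step addresses.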
Step 6: View Detailed EDA for a Feature
To explore the EDA results for a specific feature:
- Click [icon] to filter the list of features.
- Select the filter criteria: in our example, "PRIOR_APPLICATIONS" for the Primary Table and "most_frequent" for the Signal Type.
- Click on the feature name to open its details (here: 'CLIENT_Prior_Application_PRODUCT_COMBINATION_with_Highest_sum_of_Prior_Applications_AMT_CREDITs_104w'), then navigate to the 'EDA' tab.
- Interact with EDA Plots. Within the 'EDA' tab, click on the plot to activate tooltips for additional insights.
- Clear Filters and Close the Panel by clicking [icon] to remove the filter and [icon] to close the filter panel.
Step 7: Run Feature Selection
- Start Feature Selection by clicking the Magic Wand icon.
- Select the SHAP-Based mode.
Screen candidates
- Set the pre-filtering options:
  - No. of Top Features Overall: 2000
  - No. of Top Features per Theme: None
- Choose the options to:
  - Exclude Low Added Value Features.
  - Keep Dictionary and Vector Features.
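The two pre-filtering limits can be pictured with a small Python sketch. This is an assumption-laden illustration, not the product's logic: the source does not say which score drives the ranking, so the `score` key, the `theme` key, and the `prefilter` helper are all hypothetical.

```python
from collections import defaultdict


def prefilter(candidates, top_overall=2000, top_per_theme=None):
    """Keep the best-scoring candidates, optionally capped per theme.

    Each candidate is a dict with hypothetical keys "name", "theme",
    and "score". A limit of None means that no cap is applied.
    """
    ranked = sorted(candidates, key=lambda c: c["score"], reverse=True)
    if top_per_theme is not None:
        kept_per_theme = defaultdict(int)
        capped = []
        for c in ranked:
            # Keep a candidate only while its theme is under the cap.
            if kept_per_theme[c["theme"]] < top_per_theme:
                kept_per_theme[c["theme"]] += 1
                capped.append(c)
        ranked = capped
    return ranked if top_overall is None else ranked[:top_overall]


# Illustrative candidates only.
candidates = [
    {"name": "a", "theme": "t1", "score": 0.9},
    {"name": "b", "theme": "t1", "score": 0.8},
    {"name": "c", "theme": "t2", "score": 0.7},
    {"name": "d", "theme": "t2", "score": 0.1},
]
overall_only = [c["name"] for c in prefilter(candidates, top_overall=3)]
with_theme_cap = [
    c["name"] for c in prefilter(candidates, top_overall=3, top_per_theme=1)
]
print(overall_only, with_theme_cap)
```

With the settings used in this tutorial (2000 overall, no per-theme limit), only the overall cap applies.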
Obtain a Small but Predictive Feature Selection
- Number of L1 rounds: 1
  - The round trains an LGBM model and applies L1 regularization to its SHAP values.
  - This eliminates features with minimal contribution or high collinearity.
- Number of importance rounds: 1
  - The round trains an LGBM model and keeps only the top-performing features that meet the SHAP importance threshold of 0.95.
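One common way to read an importance threshold such as 0.95 is as a cumulative share: keep the smallest set of top-ranked features whose importances add up to at least 95% of the total. The sketch below assumes that reading, which the source does not confirm. The importance values are made up, and `select_by_cumulative_importance` is a hypothetical helper, not a FeatureByte function.

```python
def select_by_cumulative_importance(importances, threshold=0.95):
    """Keep the smallest top-ranked set covering `threshold` of the total.

    `importances` maps feature name -> non-negative importance, for
    example a mean absolute SHAP value (assumed here, not confirmed).
    """
    total = sum(importances.values())
    if total <= 0:
        return []
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    selected, running = [], 0.0
    for name, value in ranked:
        selected.append(name)
        running += value
        # Stop as soon as the kept features cover the requested share.
        if running / total >= threshold:
            break
    return selected


# Illustrative importances only.
mean_abs_shap = {"f1": 0.50, "f2": 0.30, "f3": 0.16, "f4": 0.03, "f5": 0.01}
kept = select_by_cumulative_importance(mean_abs_shap, threshold=0.95)
print(kept)
```

Under this reading, a higher threshold keeps more features and a lower one gives a smaller selection.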
- Once the selection is complete, ensure that the new feature selection is selected.
- For more details, navigate to the 'Feature Selection' tab and click on the selection.
Step 8: Review Features
Review Individual Features
- Click on a feature to open its details. You can use the filter or the search to find a specific feature.
- Check Semantic Relevance in the 'About' tab of the feature.
- Explore Feature Lineage by switching to the 'Lineage' tab. Click [icon] to trace the feature's origin and transformations.
- Analyze Feature Distribution and its relationship with the Target in the 'EDA' tab.
Review SDK Code for Feature Selection
- Clear the filter or search (if you used it).
- Ensure the feature selection is properly selected.
- Select features in the feature list by clicking [icon].
- Download the notebook to inspect the feature declaration code by clicking [icon].
- Open the notebook to review the feature engineering code.
Step 9: Add Features to the Feature Catalog
- Follow the same steps as in Review SDK Code for Feature Selection.
- Instead of clicking [icon] to download the notebook, click [icon].
- Name the feature list "SHAP selection from Automated Mode".
After completion, the selected features are added to the catalog, and their readiness status updates to 'DRAFT' instead of 'NEW'.