
8b. Refine Ideation

In the previous step of the tutorials, we ran a simple Feature Ideation, where the system independently generated a set of features for a single aggregation window.

Now, we focus on refining the ideation by exploring more features for the columns used by this feature set.

In this tutorial, you will learn how to:

  • Prune uninformative columns based on the prior feature selection
  • Enable filter detection
  • Review and adjust the system's suggestions for transforms and filters.

Note

If you want to learn how to incorporate UDFs, such as embeddings, to enrich feature engineering, check out the Grocery Dataset UI Tutorials.


Step 1: Create New Feature Ideation

  1. Navigate to Feature Ideation under the 'Ideate' section of the menu.

  2. Click New Ideation Button to start a new ideation process.


  3. Edit the Feature Ideation name and description by clicking Edit Button.

  4. Click Auto Run Button and configure the ideation by clicking Config Button.

  5. In Semantic Detection Setup, select the prior run feature selection to prune uninformative columns.

  6. In Filter Setup, uncheck the 'Skip Filters' option.

  7. Click Config Button to start ideation.

Step 2: Review Feature Ideation Steps

Review Table Selection

Click the corresponding button to view the detailed table selection analysis.

Review the Column Semantics Detection

Column semantics are organized by table. Select the tab corresponding to the table you are interested in.

Which Semantic Type Should You Focus On?

  • Event Tables & Time Series Tables: Focus on event_type and event_status.
  • Slowly Changing Dimension Tables: Focus on termination_timestamp / termination_date.
  • All Tables: Review non_additive_numeric, non_informative, not_to_use, ambiguous_numeric, and ambiguous_categorical semantic types.
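The checklist above can be kept as a small lookup when writing review notes. This is only a sketch: the table-type keys and the helper name `semantic_types_to_review` are hypothetical and not part of FeatureByte; only the semantic type names come from the list above.

```python
# Hypothetical review checklist. The semantic type names come from the tutorial;
# the table-type keys and the helper function are illustrative only.
REVIEW_FOCUS = {
    "event_table": ["event_type", "event_status"],
    "time_series_table": ["event_type", "event_status"],
    "scd_table": ["termination_timestamp", "termination_date"],
}

# Semantic types worth reviewing regardless of the table type.
ALWAYS_REVIEW = [
    "non_additive_numeric",
    "non_informative",
    "not_to_use",
    "ambiguous_numeric",
    "ambiguous_categorical",
]


def semantic_types_to_review(table_type: str) -> list[str]:
    """Return the semantic types to check for a given (hypothetical) table type key."""
    return REVIEW_FOCUS.get(table_type, []) + ALWAYS_REVIEW


print(semantic_types_to_review("scd_table"))
```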

Review the Transforms Detection.

Transforms are organized by table. Select the tab corresponding to the table you are interested in.

Click the corresponding button for additional insight.

Review suggested Filters Detection.

Filters are organized by table. Select the tab corresponding to the table you are interested in.

Click the corresponding button for additional insight.

Review Feature Ideation Setup

  1. Review Aggregation windows
  2. Review Event Frequency type
  3. Review Key Aggregation Column Name

Review Ideated Features

Sort by Predictive Score if you are interested in the features with the strongest correlation with the target.

Review EDA


Review Feature Selection

Click on the selection to access its details:

  • summary of the signal range for the selected features.
  • information about how the selection was generated.



Step 2: Add Features to the Feature Catalog

  1. Go to the Features tab of the Feature Selection step.

  2. Clear the search (if you used it) and any prior selection (if any) by clicking Clear Button.

  3. Clear Filters by clicking Clear Filter Button and close the filter panel by clicking Close Filter Button.

  4. Ensure the feature selection is properly selected.

  5. Select features in the feature list by clicking Select All Button.

  6. Save the selected features into the Feature Catalog by clicking Save Feature List.

  7. Call the feature list "SHAP selection after Column Pruning".

Once the process is complete, the added features should be marked as 'DRAFT'.


Step 3: Clone and reset Feature Ideation

Clone and reset the Feature Ideation to adjust the system's suggestions for column semantics, transforms and filters.

  1. Go to the Column Semantics step and click Clone Button.
  2. Edit the name and description of the new ideation.

Step 4: Adjust Column Semantics

Ensure AMT_GOODS_PRICE Semantic Type in NEW_APPLICATION table is marked as 'non_negative_amount'.

  1. Navigate to the NEW_APPLICATION tab.

  2. Set the semantic tag of AMT_GOODS_PRICE to non_negative_amount (found under numeric/additive_numeric/non_negative_amount in FeatureByte's ontology).

  3. Click Next Step Button to proceed to Transforms.


Step 5: Adjust Transforms

  1. Navigate to the INSTALLMENTS_PAYMENTS tab.

  2. Open the Transforms window.

  3. Create a time delta between actual_installment_date and scheduled_installment_date:

    • Select the Time delta operation.
    • Choose the Day unit.
    • Set the actual_installment_date column as Operand 1 and the scheduled_installment_date column as Operand 2. Both use the 'Column' Operand type.
    • Generate a name and relevance.
    • Review the relevance explanation and edit the suggested name if needed.
  4. Click the corresponding button and check that the new transform is saved.

Click Next Step Button to proceed to Filters.
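To make the transform concrete, here is a minimal sketch of what a day-unit time delta between the two columns computes. It assumes the delta is Operand 1 minus Operand 2 (actual minus scheduled), which the tutorial does not state explicitly; the helper name `payment_delay_days` and the sample rows are made up for illustration.

```python
from datetime import date


def payment_delay_days(actual: date, scheduled: date) -> int:
    """Day-unit time delta, assumed here to be Operand 1 (actual) minus Operand 2 (scheduled)."""
    return (actual - scheduled).days


# Made-up sample rows, for illustration only.
installments = [
    {"actual_installment_date": date(2024, 3, 5), "scheduled_installment_date": date(2024, 3, 1)},
    {"actual_installment_date": date(2024, 4, 1), "scheduled_installment_date": date(2024, 4, 1)},
    {"actual_installment_date": date(2024, 4, 28), "scheduled_installment_date": date(2024, 5, 1)},
]

delays = [
    payment_delay_days(row["actual_installment_date"], row["scheduled_installment_date"])
    for row in installments
]
print(delays)  # [4, 0, -3]
```

Under this sign convention, a positive value means the payment was made after its scheduled date and a negative value means it was made early.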


Step 6: Adjust Filters

We will delete three filters and create two new ones in the PREVIOUS_APPLICATION table.

We will delete:

  • Filter 2: Approved Contract_status Consumer loans Contract_type Priorapplication
  • Filter 3: Cash loans Contract_type Priorapplication
  • Filter 4: Consumer loans Contract_type Priorapplication

and create two additional filters:

  • CONTRACT_STATUS == Refused
  • YIELD_GROUP == high

Delete filters

  1. Navigate to the PREVIOUS_APPLICATION tab.

  2. Open the Filters window.

  3. Delete the three filters listed above by clicking the corresponding button for each.

Create filter with CONTRACT_STATUS == Refused

  1. Add a new filter by clicking the corresponding button.

  2. Select Filter Column and choose CONTRACT_STATUS as the filter column.

  3. Complete the filter condition by specifying the filter values. This opens a new window listing all eligible values.

  4. Identify the most relevant values by clicking the corresponding button, then finalize your selection by choosing 'Refused'.

  5. Generate the filter name and relevance.

  6. Check the relevance of the new filter.

  7. Change the filter type to secondary filter. This determines the complexity of the features that will be ideated for this filter.

Create filter with YIELD_GROUP == high

  1. Add a new filter by clicking the corresponding button. Select Filter Column and choose YIELD_GROUP as the filter column.

  2. Identify the most relevant values by clicking the corresponding button, then finalize your selection by choosing 'high'.

  3. Generate the filter name and relevance. Check the relevance of the new filter. Change the filter type to secondary filter.
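As an illustration of what these two filters do to the data before aggregation, the sketch below counts matching prior applications per customer. It is not FeatureByte code: the customer key `SK_ID_CURR`, the sample rows, and the helper `count_by_customer` are assumptions made for the example, and the features the system actually ideates from these filters may differ.

```python
from collections import Counter

# Made-up rows. SK_ID_CURR is an assumed customer key, and the non-target
# category values ("Approved", "low", "middle") are placeholders.
previous_applications = [
    {"SK_ID_CURR": 1, "CONTRACT_STATUS": "Refused", "YIELD_GROUP": "high"},
    {"SK_ID_CURR": 1, "CONTRACT_STATUS": "Approved", "YIELD_GROUP": "low"},
    {"SK_ID_CURR": 2, "CONTRACT_STATUS": "Refused", "YIELD_GROUP": "middle"},
    {"SK_ID_CURR": 2, "CONTRACT_STATUS": "Approved", "YIELD_GROUP": "high"},
    {"SK_ID_CURR": 1, "CONTRACT_STATUS": "Refused", "YIELD_GROUP": "high"},
]


def count_by_customer(rows, column, value):
    """Keep only rows where `column == value`, then count rows per customer."""
    return Counter(row["SK_ID_CURR"] for row in rows if row[column] == value)


refused_counts = count_by_customer(previous_applications, "CONTRACT_STATUS", "Refused")
high_yield_counts = count_by_customer(previous_applications, "YIELD_GROUP", "high")

print(dict(refused_counts))     # {1: 2, 2: 1}
print(dict(high_yield_counts))  # {1: 2, 2: 1}
```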

Review the final Filter list and proceed to Feature Selection

  1. Click Name Name

  2. Click Auto Run Button and then Start Button to proceed up to Feature Selection.

  3. Once complete, review the suggested Feature Selection.


Step 7: Run new SHAP Feature Selection

To further reduce the number of features, run a new feature selection with an additional round of feature importance based on SHAP values.

  1. Go to the Features tab of the Feature Selection step.

  2. Select All Features.

  3. Start Feature Selection by clicking on the Magic Wand.

  4. Select the SHAP-Based mode, select filtered features and increase the number of importance rounds to 2.

  5. Once the selection is complete, review the selected features.


Step 8: Add New SHAP Feature Selection to the Feature Catalog

  1. Clear the search (if you used it) and any prior selection (if any) by clicking Clear Button.
  2. Select features in the feature list by clicking Select All Button.
  3. Save the selected features into the Feature Catalog by clicking Save Feature List.

  4. Call the feature list "SHAP selection (1+2) after adjusted ideation".


Step 9: Add Original SHAP Feature Selection to the Feature Catalog

  1. Clear the search (if you used it) and any prior selection (if any) by clicking Clear Button.
  2. Select the original SHAP Feature Selection and click Select All Button.
  3. Save the selected features into the Feature Catalog by clicking Save Feature List.

  4. Call the feature list "Suggested SHAP selection (1+1) after adjusted ideation".


Step 10: Run Rule-Based Feature Selection

To ensure that the feature catalog covers all themes available, we will run a Rule-Based Selection.

  1. Select All Features and initiate a new selection by clicking on the Magic Wand.

  2. Select the Rule-Based mode, and select the top 1 feature for each theme.

  3. Once the selection is complete, review the selected features.

  4. In the Features tab, click Select All Button. Click Save Feature List. Call the feature list "Top feature per theme".
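The idea behind a "top 1 feature per theme" rule can be sketched as a group-and-rank step. This is an assumption-laden illustration, not FeatureByte's implementation: the tutorial does not say which metric defines "top", so the sketch ranks by a generic `score` field, and the feature names, themes, and scores are invented.

```python
def top_features_per_theme(features, top_n=1):
    """Group features by theme and keep the `top_n` highest-scoring ones in each group."""
    by_theme = {}
    for feature in features:
        by_theme.setdefault(feature["theme"], []).append(feature)

    selected = []
    for group in by_theme.values():
        ranked = sorted(group, key=lambda f: f["score"], reverse=True)
        selected.extend(ranked[:top_n])
    return selected


# Invented candidates: names, themes and scores are placeholders.
candidates = [
    {"name": "feature_a", "theme": "theme_1", "score": 0.12},
    {"name": "feature_b", "theme": "theme_1", "score": 0.30},
    {"name": "feature_c", "theme": "theme_2", "score": 0.08},
    {"name": "feature_d", "theme": "theme_2", "score": 0.05},
    {"name": "feature_e", "theme": "theme_3", "score": 0.21},
]

selection = top_features_per_theme(candidates, top_n=1)
print([f["name"] for f in selection])  # ['feature_b', 'feature_c', 'feature_e']
```

Selecting one feature per theme trades raw predictive strength for coverage: every theme stays represented in the catalog even if its best feature scores lower than several features from another theme.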