
9. Create New Feature Lists and Models

In the previous tutorials, Ideation created two feature lists that are now available in the Catalog.

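In this tutorial, you will learn several ways to create new Feature Lists:

  • With the Feature List Builder
  • With the Regularized SHAP-based Feature List Simplification
  • From a model's feature importance per feature key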

The latter two methods support Key-Based Feature extraction from dictionary-style features, which can improve interpretability when working with nested or high-dimensional structures.
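As a rough illustration of what key-based extraction means, the sketch below turns one key of a dictionary-style feature into a plain scalar feature. The entity names, keys, values, and the `extract_key` helper are all hypothetical and are not part of the FeatureByte API; the point is only to show the shape of the transformation.

```python
from typing import Dict

# Hypothetical dictionary-style feature: for each client, counts of past
# credits keyed by credit type. Names and values are illustrative only.
credit_type_counts: Dict[str, Dict[str, int]] = {
    "client_a": {"Consumer credit": 3, "Credit card": 1},
    "client_b": {"Credit card": 4},
    "client_c": {},
}


def extract_key(
    feature: Dict[str, Dict[str, int]], key: str, default: int = 0
) -> Dict[str, int]:
    """Turn one key of a dictionary feature into a scalar feature per entity."""
    return {entity: values.get(key, default) for entity, values in feature.items()}


# One scalar per client, easier to interpret than the whole dictionary.
credit_card_counts = extract_key(credit_type_counts, "Credit card")
print(credit_card_counts)
```

A scalar such as "number of credit card credits" can be read directly in a model explanation, whereas the full dictionary cannot.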

You’ll also learn how to train new models using these feature lists, or materialize features to train models outside FeatureByte.


Step 1: Review Existing Feature Lists

  1. Navigate to the Feature Lists Catalog under the Experiment section of the menu. Confirm that the two feature lists Ideation previously created are listed.


  2. Click the feature list suggested during Ideation.


  3. Go to the Features tab to review the features in the list.


  4. Open the Themes tab to identify signal types missing from the feature list. Once reviewed, close the window.


Step 2: Create a New Feature List Using the Feature List Builder

  1. Add the feature list suggested by Ideation to the Feature List Builder by clicking the Plus button.


  2. Review the Builder’s suggestions by clicking Theme Suggestions in the Suggestion section at the bottom of the Feature List Builder.


  3. Click Show Features for the CLIENT/BUREAU/MOST FREQUENT theme to explore the associated features. Review the EDA results and add the feature using the Plus button.


  4. Review the feature list by clicking the Review button, then save it using the Save List button. Name it “1 + Ideated Features”.


  5. Verify the new feature list appears in the Catalog.


Step 3: Use the Regularized SHAP-based Feature List Simplification

This technique produces a simple and interpretable Feature List through a two-step process:

  • Template Model Training: Train an XGBoost or LightGBM template model on nested training data to generate SHAP values.
  • Regularized Linear Model Training: Train a regularized linear model on nested validation data, using the SHAP values from the template model as inputs. The regularization encourages sparsity, so features whose coefficients shrink to zero drop out of the feature set.
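The two-step idea can be sketched on synthetic data with NumPy alone. This is not FeatureByte's implementation. To avoid extra dependencies, the sketch swaps in a least-squares linear model for the XGBoost or LightGBM template (for a linear model with independent features, SHAP values have the closed form `weight * (x - mean)`), uses a regression target instead of a classification target, and fits the L1-regularized linear model with a small hand-written proximal gradient loop. All names, data, and the penalty value `lam=0.05` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_rows, n_features = 400, 6
# Synthetic ground truth: only the first two features carry signal.
true_weights = np.array([2.0, -1.5, 0.0, 0.0, 0.0, 0.0])


def make_data(n):
    X = rng.normal(size=(n, n_features))
    y = X @ true_weights + 0.1 * rng.normal(size=n)
    return X, y


X_train, y_train = make_data(n_rows)
X_valid, y_valid = make_data(n_rows)

# Step 1 (stand-in template model): least squares fitted on the training data.
train_mean = X_train.mean(axis=0)
weights, *_ = np.linalg.lstsq(
    X_train - train_mean, y_train - y_train.mean(), rcond=None
)
# Closed-form SHAP values of the linear stand-in on the validation data.
shap_valid = (X_valid - train_mean) * weights


# Step 2: L1-regularized linear model on the SHAP values (proximal gradient).
def lasso_ista(A, y, lam, n_iter=500):
    n, d = A.shape
    coef = np.zeros(d)
    intercept = y.mean()
    step = n / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant
    for _ in range(n_iter):
        grad = A.T @ (A @ coef + intercept - y) / n
        z = coef - step * grad
        # Soft-thresholding sets small coefficients exactly to zero.
        coef = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
        intercept = (y - A @ coef).mean()
    return coef, intercept


coef, intercept = lasso_ista(shap_valid, y_valid, lam=0.05)
selected = np.flatnonzero(coef != 0.0)  # indices of the features that survive
print(selected)
```

Features with little influence on the template model have SHAP columns of very small magnitude, so the L1 penalty removes them, which is why the resulting list tends to be short.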

  1. Click the Simplification button for the feature list suggested by Ideation.



  2. Select Applications up to Sept 2024 as the training table and NCTsDE_LGB_classification as the model template. Then click the Create button.



  3. Once the task finishes (this may take some time), verify the new Feature List appears in the Catalog.



  4. Click on the Feature List to access its details. Search for "when" to list Key-Based Features.



Step 4: Create a New Feature List from a Model

  1. Navigate to Leaderboard under the Experiment menu and configure the following:

    • Observation Table: Applications Q4 2024
    • Type: Validation
    • Metric: AUC



  2. Click the best-performing model and open its Feature Importance tab.


  3. Select the Per Feature Key panel, which may list close to 900 feature keys. Click the New Feature List button and set the Importance Threshold Percentage to 0.90. This generates new features derived from dictionary features and creates a feature list composed of the model's top features and keys.


  4. Return to the Feature Lists Catalog to verify that the new list appears. It should contain more features than the list produced by the Regularized SHAP-based Feature List Simplification.
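One plausible reading of an importance threshold of 0.90 is "keep the most important features and keys until they cover 90% of the total importance". The sketch below implements that reading. It is an assumption about how such a threshold could work, not a description of FeatureByte's exact selection rule, and the feature names and importance values are made up.

```python
def select_by_cumulative_importance(importances, threshold):
    """Keep top-ranked items until their importance covers `threshold` of the total."""
    total = sum(importances.values())
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    selected, running = [], 0.0
    for name, value in ranked:
        if running >= threshold * total:
            break  # the requested share of importance is already covered
        selected.append(name)
        running += value
    return selected


# Hypothetical importances; "feature[key]" stands for one key of a dictionary feature.
importances = {
    "f_a": 0.50,
    "f_b[key_1]": 0.30,
    "f_b[key_2]": 0.12,
    "f_c": 0.05,
    "f_d": 0.03,
}
top_items = select_by_cumulative_importance(importances, 0.90)
print(top_items)
```

Under this reading, a higher threshold keeps more of the long tail of keys, which is consistent with this list ending up larger than the simplified one.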


Step 5: Train New Models with the New Feature Lists

  1. Click the New Model button to train a new model with the simplified feature list.


    Configure your model as follows:

    • Metric: area_under_curve
    • Training Observation Table: Applications up to Sept 2024
    • Validation Observation Table: Applications Q4 2024
    • Model Template: NCTsDE_LGB_classification

    You can review and edit parameters by clicking on them.


  2. Click the Run in Background button. Repeat these steps to train a new model with the feature list derived from the feature key importance.



  3. Navigate to Tasks under the Manage menu to track model training progress.


  4. Once training completes, verify the new model appears on the leaderboard.


Step 6: (Optional) Compute a Feature Table

If you want to train a model outside FeatureByte, you can compute a Feature Table using the same feature list.

  1. Return to the Feature Lists Catalog.

  2. Click the Compute icon.


  3. Select the Observation Table Applications with Credit Default target and confirm by clicking the Compute button.


  4. Once materialization completes, navigate to the Feature Tables Catalog under Experiment to confirm creation.
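If you train a model outside FeatureByte on the materialized features, you may want to score it with the same metric used on the leaderboard so the results are comparable. The sketch below computes AUC from labels and predicted scores using the pairwise (Mann-Whitney) definition. The labels and scores are made up, and how you export the feature table and obtain predictions is left out because it depends on your setup.

```python
def auc(labels, scores):
    """Share of (positive, negative) pairs ranked correctly; ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical validation labels (1 = default) and scores from an external model.
labels = [0, 0, 1, 1, 0, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70]
validation_auc = auc(labels, scores)
print(round(validation_auc, 3))
```

The pairwise loop is quadratic in the number of rows, so for large validation tables a rank-based implementation from an established library is the better choice.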