FeatureByte: A Comprehensive Workflow for Feature Engineering

FeatureByte streamlines the end-to-end feature engineering process, from feature creation to deployment and lifecycle management.

Workflow Diagram

Ready to get started? Explore each step and its corresponding tutorials to unlock the full potential of FeatureByte.

Step 1: Define Your Data Model

Create a Catalog: Organize your data into a clear and accessible structure.
Register Tables: Classify tables as event, item, time series, dimension, or slowly changing dimension.
Define Entities: Identify key entities within your data and tag relevant columns.
Enhance Data Quality: Set default cleaning operations to ensure data accuracy.
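To make the pieces of a data model concrete, here is a minimal plain-Python sketch. It does not use the FeatureByte SDK; the class names (`Catalog`, `Table`, `TableType`) and the `grocery` / `invoices` example are hypothetical and only illustrate how table types, entity tags, and default cleaning operations relate to each other.

```python
from dataclasses import dataclass, field
from enum import Enum


class TableType(Enum):
    """The table classifications listed above."""
    EVENT = "event"
    ITEM = "item"
    TIME_SERIES = "time_series"
    DIMENSION = "dimension"
    SCD = "slowly_changing_dimension"


@dataclass
class Table:
    name: str
    table_type: TableType
    # column name -> entity name (entity tagging)
    entity_columns: dict = field(default_factory=dict)
    # column name -> value that replaces missing entries (default cleaning)
    default_imputations: dict = field(default_factory=dict)

    def clean(self, row: dict) -> dict:
        """Return a copy of the row with missing values replaced by defaults."""
        cleaned = dict(row)
        for column, fill_value in self.default_imputations.items():
            if cleaned.get(column) is None:
                cleaned[column] = fill_value
        return cleaned


@dataclass
class Catalog:
    name: str
    tables: dict = field(default_factory=dict)

    def register(self, table: Table) -> None:
        self.tables[table.name] = table


# Hypothetical example data model
catalog = Catalog("grocery")
invoices = Table(
    name="invoices",
    table_type=TableType.EVENT,
    entity_columns={"customer_id": "customer"},
    default_imputations={"amount": 0.0},
)
catalog.register(invoices)

print(invoices.clean({"customer_id": "c1", "amount": None}))
# {'customer_id': 'c1', 'amount': 0.0}
```

Declaring the cleaning default once at the table level means every feature built on `amount` would see the same imputed value, which is the point of setting defaults before creating features.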

Tutorials

Credit Default UI Tutorials: Create Catalog, Register Tables, Register Entities, Update Descriptions and Tag Semantics, Set Default Cleaning Operations.
Credit Default SDK Tutorials: Create Catalog, Register Tables, Register Entities, Update Descriptions, Set Default Cleaning Operations.
Grocery UI Tutorials: Create Catalog, Register Tables, Register Entities, Update Descriptions and Tag Semantics, Set Default Cleaning Operations.
Grocery SDK Tutorials: Create Catalog, Register Tables, Register Entities, Update Descriptions, Set Default Cleaning Operations.

Step 2: Formulate Your Use Case

Identify the Primary Entity: Determine the central focus of your use case.
Define the Target: Specify the outcome you aim to predict or classify.
Establish the Context: Define the scope and circumstances in which features are expected to be served.
Create Observation Sets: Define the specific data points to be used for modeling.
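As an illustration of how these four pieces fit together, the sketch below builds a small observation set in plain Python. It is not the FeatureByte API: the function name, column names, and sample events are assumptions. The primary entity is a customer, the target is "purchased within the next 14 days", and each observation row pairs an entity with a point in time.

```python
from datetime import datetime, timedelta

# Hypothetical purchase events: (customer_id, event_time)
events = [
    ("c1", datetime(2024, 1, 5)),
    ("c1", datetime(2024, 1, 20)),
    ("c2", datetime(2024, 1, 3)),
]


def build_observation_set(customers, points_in_time, events, horizon):
    """One row per (entity, point in time); the target only looks forward."""
    rows = []
    for customer in customers:
        for pit in points_in_time:
            # Target: any purchase in the interval (pit, pit + horizon]
            purchased = any(
                cid == customer and pit < ts <= pit + horizon
                for cid, ts in events
            )
            rows.append(
                {"customer_id": customer, "point_in_time": pit, "target": int(purchased)}
            )
    return rows


observation_set = build_observation_set(
    customers=["c1", "c2"],
    points_in_time=[datetime(2024, 1, 10)],
    events=events,
    horizon=timedelta(days=14),
)

for row in observation_set:
    print(row["customer_id"], row["target"])
# c1 1
# c2 0
```

Keeping the target strictly after the point in time, and features strictly before it, is what keeps a training set free of leakage.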

Tutorials

Credit Default UI Tutorials: Formulate Use Case, Create Observation Tables.
Credit Default SDK Tutorials: Formulate Use Case, Create Observation Tables.
Grocery UI Tutorials: Formulate Use Case, Create Observation Tables.
Grocery SDK Tutorials: Formulate Use Case, Create Observation Tables.

Step 3: Create Features

Use Feature Ideation to generate relevant feature suggestions.
Evaluate features using EDA and select the most promising ones with various feature selection strategies.

Tutorials

Credit Default SDK Tutorials: Create Lookup features, Create Window Aggregates from Event Table, Create Features from SCD, Create Calendar Window Aggregates from Time Series,
Grocery SDK Tutorials: Create lookup feature, Create window aggregate features, Derive features from other features, Derive similarity features from bucketing, Use embeddings,
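The window aggregates named in these tutorials can be pictured with a small plain-Python sketch. This is not the SDK's implementation; the `window_sum` helper and the sample events are hypothetical. It shows the core idea of a point-in-time window aggregate: only events inside the window that ends at the point in time contribute to the feature value.

```python
from datetime import datetime, timedelta

# Hypothetical event table rows: (customer_id, event_time, amount)
events = [
    ("c1", datetime(2024, 1, 1), 10.0),
    ("c1", datetime(2024, 1, 6), 5.0),
    ("c1", datetime(2024, 1, 9), 7.5),
    ("c2", datetime(2024, 1, 8), 20.0),
]


def window_sum(events, customer_id, point_in_time, window):
    """Sum amounts for one customer over [point_in_time - window, point_in_time)."""
    start = point_in_time - window
    return sum(
        amount
        for cid, ts, amount in events
        if cid == customer_id and start <= ts < point_in_time
    )


pit = datetime(2024, 1, 9)
print(window_sum(events, "c1", pit, timedelta(days=7)))  # 5.0
print(window_sum(events, "c2", pit, timedelta(days=7)))  # 20.0
```

For `c1`, the January 1 event falls before the window and the January 9 event is not strictly before the point in time, so only the January 6 purchase counts.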

Step 4: Build Feature Lists and Experiment

Build Feature Lists: Combine selected features into cohesive sets using Feature Ideation, the Feature List Builder, or the SDK.
Generate Historical Feature Data: Prepare historical data for model training and evaluation.
Train and Test Models: Iterate on model development and hyperparameter tuning.
Share Promising Features: Collaborate with team members and share valuable insights.
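The relationship between a feature list, an observation set, and historical feature data can be sketched as follows. This is an illustrative plain-Python model, not the FeatureByte SDK: the dictionary-based `feature_list`, the `compute_historical_features` function, and the sample data are all assumptions made for the example.

```python
from datetime import datetime, timedelta

# Hypothetical event rows: (customer_id, event_time, amount)
events = [
    ("c1", datetime(2024, 1, 6), 5.0),
    ("c1", datetime(2024, 1, 8), 7.0),
    ("c2", datetime(2024, 1, 2), 20.0),
]


def _amounts_in_window(customer_id, pit, days):
    """Amounts for one customer in [pit - days, pit)."""
    start = pit - timedelta(days=days)
    return [a for cid, ts, a in events if cid == customer_id and start <= ts < pit]


# A feature list here is a named set of feature functions.
feature_list = {
    "amount_sum_7d": lambda cid, pit: sum(_amounts_in_window(cid, pit, 7)),
    "purchase_count_7d": lambda cid, pit: len(_amounts_in_window(cid, pit, 7)),
}

observation_set = [
    {"customer_id": "c1", "point_in_time": datetime(2024, 1, 10)},
    {"customer_id": "c2", "point_in_time": datetime(2024, 1, 10)},
]


def compute_historical_features(observation_set, feature_list):
    """Evaluate every feature at every (entity, point in time) row."""
    table = []
    for row in observation_set:
        values = {
            name: fn(row["customer_id"], row["point_in_time"])
            for name, fn in feature_list.items()
        }
        table.append({**row, **values})
    return table


training_table = compute_historical_features(observation_set, feature_list)
for row in training_table:
    print(row["customer_id"], row["amount_sum_7d"], row["purchase_count_7d"])
# c1 12.0 2
# c2 0 0
```

The resulting table has one row per observation and one column per feature, which is the shape a model training step expects.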

Tutorials

Credit Default UI Tutorials: Create New Feature List, Compute Feature Table.
Credit Default SDK Tutorials: Create Feature Lists, Compute Historical Feature Values.
Grocery UI Tutorials: Create New Feature List, Compute Feature Table.
Grocery SDK Tutorials: Create Feature Lists, Compute Historical Feature Values.

Step 5: Deploy Features

Mark Features as Production-Ready: Certify features for deployment.
Deploy Features and Enable Serving: Make features available for real-time or batch inference.

In catalogs with Approval Flow enabled, deploying features involves:

Verifying feature compliance with default cleaning operations and feature job settings.
Checking the status of source tables.
Backtesting feature job settings to guard against future training-serving inconsistencies.
Sharing the feature definition file for review and approval.

This comprehensive process ensures governance and reduces the risk of errors in production.
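One way to picture the backtesting check is the sketch below. It assumes that a feature job setting includes a blind spot, meaning a recent interval before the job time that is excluded from aggregation so that late-arriving records do not change feature values after the fact. The text above does not spell this out, and the `late_records` helper and sample records are hypothetical. A record is a problem when its event time falls inside the aggregated range but it reaches the warehouse only after the job has run: training would see it, serving would not.

```python
from datetime import datetime, timedelta

# Hypothetical records: (event_time, arrival_time in the warehouse)
records = [
    (datetime(2024, 1, 1, 9, 50), datetime(2024, 1, 1, 9, 55)),
    (datetime(2024, 1, 1, 9, 40), datetime(2024, 1, 1, 10, 5)),  # arrives 25 minutes late
    (datetime(2024, 1, 1, 9, 58), datetime(2024, 1, 1, 10, 2)),
]


def late_records(records, job_time, blind_spot):
    """Records inside the aggregated range that arrive only after the job runs."""
    cutoff = job_time - blind_spot  # the job aggregates events before this time
    return [
        (event_time, arrival)
        for event_time, arrival in records
        if event_time < cutoff and arrival > job_time
    ]


job_time = datetime(2024, 1, 1, 10, 0)
print(len(late_records(records, job_time, timedelta(minutes=10))))  # 1
print(len(late_records(records, job_time, timedelta(minutes=30))))  # 0
```

With a 10-minute blind spot, the 9:40 event is expected by the job but arrives at 10:05, so it would be missed at serving time. Widening the blind spot to 30 minutes removes the inconsistency at the cost of slightly staler features.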

Tutorials

Step 6: Manage the Feature Lifecycle

Update Feature Job Settings and Cleaning Operations: Adjust configurations when data quality or availability changes.
Create New Feature Versions: Produce updated versions reflecting the latest settings and mark them as default.
Monitor Feature Job Status: Regularly review feature job status to track performance and ensure smooth operations.
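The versioning step can be sketched with a small registry. This is a hypothetical plain-Python model, not the FeatureByte SDK: `FeatureNamespace`, `FeatureVersion`, and the `blind_spot_minutes` setting are names invented for the example. It shows the idea that changing a setting produces a new version rather than mutating the old one, and that one version is marked as the default.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class FeatureVersion:
    """An immutable snapshot of a feature's settings."""
    version: int
    blind_spot_minutes: int


@dataclass
class FeatureNamespace:
    name: str
    versions: list = field(default_factory=list)
    default_version: Optional[int] = None

    def new_version(self, blind_spot_minutes: int, make_default: bool = True) -> FeatureVersion:
        """Add a version with updated settings; earlier versions stay untouched."""
        version = FeatureVersion(len(self.versions) + 1, blind_spot_minutes)
        self.versions.append(version)
        if make_default:
            self.default_version = version.version
        return version

    def default(self) -> FeatureVersion:
        """Return the version currently marked as default."""
        return next(v for v in self.versions if v.version == self.default_version)


feature = FeatureNamespace("amount_sum_7d")
feature.new_version(blind_spot_minutes=10)
# Data now arrives later, so a new version with a wider blind spot is created.
feature.new_version(blind_spot_minutes=30)

print(feature.default().version, feature.default().blind_spot_minutes)  # 2 30
print(len(feature.versions))  # 2
```

Keeping the earlier version intact means models already served with it continue to receive the values they were trained on.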

In catalogs with Approval Flow enabled, changes in table metadata trigger a review process.

Tutorials