FeatureByte: A Comprehensive Workflow for Feature Engineering
FeatureByte streamlines the end-to-end feature engineering process, from feature creation to deployment and lifecycle management.
Ready to get started? Explore each step and its corresponding tutorials to unlock the full potential of FeatureByte.
Step 1: Define Your Data Model
- Create a Catalog: Organize your data into a clear and accessible structure.
- Register Tables: Classify tables as event, item, time series, dimension, or slowly changing dimension.
- Define Entities: Identify the key entities in your data and tag the columns that represent them.
- Enhance Data Quality: Set default cleaning operations to ensure data accuracy.
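As a rough illustration of entity tagging and default cleaning operations, here is a minimal plain-Python sketch. It does not use the FeatureByte SDK: `ColumnSpec`, `impute_missing`, `clean_rows`, and the sample rows are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class ColumnSpec:
    """Hypothetical column metadata: an optional entity tag and a default cleaning step."""
    name: str
    entity: Optional[str] = None
    cleaning: Optional[Callable[[Any], Any]] = None


def impute_missing(value_if_missing: Any) -> Callable[[Any], Any]:
    """Build a cleaning operation that replaces None with a fixed value."""
    def clean(value: Any) -> Any:
        return value_if_missing if value is None else value
    return clean


def clean_rows(rows, columns):
    """Apply each column's default cleaning operation to every row."""
    cleaners = {c.name: c.cleaning for c in columns if c.cleaning is not None}
    return [
        {key: cleaners[key](val) if key in cleaners else val for key, val in row.items()}
        for row in rows
    ]


# Assumed example data: customer_id is tagged as the "customer" entity,
# and missing amounts default to 0.0.
columns = [
    ColumnSpec("customer_id", entity="customer"),
    ColumnSpec("amount", cleaning=impute_missing(0.0)),
]
raw_rows = [
    {"customer_id": "C1", "amount": 12.5},
    {"customer_id": "C2", "amount": None},
]
cleaned_rows = clean_rows(raw_rows, columns)
```

Declaring the cleaning step once, next to the column, means every feature built on that column would inherit the same treatment instead of repeating it.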
Tutorials
- Credit Default UI Tutorials: Create Catalog, Register Tables, Register Entities, Update Descriptions and Tag Semantics, Set Default Cleaning Operations.
- Credit Default SDK Tutorials: Create Catalog, Register Tables, Register Entities, Update Descriptions, Set Default Cleaning Operations.
- Grocery UI Tutorials: Create Catalog, Register Tables, Register Entities, Update Descriptions and Tag Semantics, Set Default Cleaning Operations.
- Grocery SDK Tutorials: Create Catalog, Register Tables, Register Entities, Update Descriptions, Set Default Cleaning Operations.
Step 2: Formulate Your Use Case
- Identify the Primary Entity: Determine the central focus of your use case.
- Define the Target: Specify the outcome you aim to predict or classify.
- Establish the Context: Define the scope and circumstances in which features are expected to be served.
- Create Observation Sets: Define the specific data points to be used for modeling.
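One way to picture an observation set is as a list of rows, each pairing an entity key with a point in time and a target value. The sketch below is plain Python, not the FeatureByte SDK. The customer IDs, the default events, and the 30-day horizon are assumptions made up for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical default events per customer (the primary entity in this sketch).
default_events = {
    "C1": [datetime(2024, 3, 10)],
    "C2": [],
}


def target_default_within(customer_id, point_in_time, horizon):
    """Target: does the customer default within `horizon` after the point in time?"""
    return any(
        point_in_time < ts <= point_in_time + horizon
        for ts in default_events.get(customer_id, [])
    )


def build_observation_set(customer_ids, points_in_time, horizon):
    """One row per (entity key, point in time), labelled with the target."""
    return [
        {
            "customer_id": cid,
            "point_in_time": pit,
            "target": target_default_within(cid, pit, horizon),
        }
        for cid in customer_ids
        for pit in points_in_time
    ]


observation_set = build_observation_set(
    customer_ids=["C1", "C2"],
    points_in_time=[datetime(2024, 3, 1), datetime(2024, 4, 1)],
    horizon=timedelta(days=30),
)
```

The target looks forward from the point in time, while features (in later steps) look backward from it. Keeping the two directions separate is what prevents leakage.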
Tutorials
- Credit Default UI Tutorials: Formulate Use Case, Create Observation Tables.
- Credit Default SDK Tutorials: Formulate Use Case, Create Observation Tables.
- Grocery UI Tutorials: Formulate Use Case, Create Observation Tables.
- Grocery SDK Tutorials: Formulate Use Case, Create Observation Tables.
Step 3: Create Features
Automated Feature Creation:
- Use Feature Ideation to generate relevant feature suggestions.
- Evaluate features using EDA and select the most promising ones with various feature selection strategies.
Tutorials
- Credit Default UI Tutorials: Ideate Features, Refine Ideation.
- Grocery UI Tutorials: Ideate Features, Refine Ideation.
Manual Feature Creation:
- Leverage the Python SDK to create custom features.
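To show what a window aggregate feature computes, here is a minimal plain-Python sketch. It deliberately avoids the FeatureByte SDK, whose calls are covered in the tutorials. The `events` rows and the `window_sum` helper are hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical event rows: (customer_id, event_timestamp, amount).
events = [
    ("C1", datetime(2024, 1, 1), 10.0),
    ("C1", datetime(2024, 1, 5), 20.0),
    ("C1", datetime(2024, 1, 20), 40.0),
    ("C2", datetime(2024, 1, 6), 5.0),
]


def window_sum(customer_id, point_in_time, window):
    """Sum amounts for one customer in [point_in_time - window, point_in_time)."""
    start = point_in_time - window
    return sum(
        amount
        for cid, ts, amount in events
        if cid == customer_id and start <= ts < point_in_time
    )


# 7-day spend for C1 as of 2024-01-07: the 2024-01-20 event lies in the
# future relative to the point in time and is excluded.
spend_7d = window_sum("C1", datetime(2024, 1, 7), timedelta(days=7))
```

The upper bound of the window is exclusive, so an event stamped exactly at the point in time does not count toward the feature.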
Tutorials
- Credit Default SDK Tutorials: Create Lookup Features, Create Window Aggregates from Event Table, Create Features from SCD, Create Calendar Window Aggregates from Time Series.
- Grocery SDK Tutorials: Create Lookup Feature, Create Window Aggregate Features, Derive Features from Other Features, Derive Similarity Features from Bucketing, Use Embeddings.
Step 4: Experiment and Iterate
- Build Feature Lists: Combine selected features into cohesive sets using Feature Ideation, the Feature List Builder, or the SDK.
- Generate Historical Feature Data: Prepare historical data for model training and evaluation.
- Train and Test Models: Iterate on model development and hyperparameter tuning.
- Share Promising Features: Collaborate with team members and share valuable insights.
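The relationship between a feature list, an observation set, and historical feature data can be sketched in plain Python. This is an illustration under stated assumptions, not the FeatureByte SDK: the `events` rows, the two feature functions, and `compute_historical_features` are all hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical purchase events: (customer_id, event_timestamp, amount).
events = [
    ("C1", datetime(2024, 1, 2), 10.0),
    ("C1", datetime(2024, 1, 9), 30.0),
    ("C2", datetime(2024, 1, 8), 5.0),
]


def past_amounts(customer_id, point_in_time, days):
    """Amounts for one customer in the `days` before the point in time."""
    start = point_in_time - timedelta(days=days)
    return [a for cid, ts, a in events if cid == customer_id and start <= ts < point_in_time]


# In this sketch a "feature list" is just a named collection of feature functions.
feature_list = {
    "spend_7d": lambda cid, pit: sum(past_amounts(cid, pit, 7)),
    "purchase_count_28d": lambda cid, pit: len(past_amounts(cid, pit, 28)),
}

observation_set = [
    {"customer_id": "C1", "point_in_time": datetime(2024, 1, 10)},
    {"customer_id": "C2", "point_in_time": datetime(2024, 1, 10)},
]


def compute_historical_features(observations, features):
    """Attach every feature value, computed as of each row's point in time."""
    return [
        {
            **row,
            **{name: fn(row["customer_id"], row["point_in_time"]) for name, fn in features.items()},
        }
        for row in observations
    ]


training_rows = compute_historical_features(observation_set, feature_list)
```

Each resulting row carries the entity key, the point in time, and one value per feature, which is the shape a model training step would typically consume.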
Tutorials
- Credit Default UI Tutorials: Create New Feature List, Compute Feature Table.
- Credit Default SDK Tutorials: Create Feature Lists, Compute Historical Feature Values.
- Grocery UI Tutorials: Create New Feature List, Compute Feature Table.
- Grocery SDK Tutorials: Create Feature Lists, Compute Historical Feature Values.
Step 5: Deploy and Serve Features
- Mark Features as Production-Ready: Certify features for deployment.
- Deploy Features and Enable Serving: Make features available for real-time or batch inference.
In catalogs with Approval Flow enabled, deploying features involves:
- Verifying feature compliance with default cleaning operations and feature job settings.
- Checking the status of source tables.
- Backtesting feature job settings to guard against training-serving inconsistencies.
- Sharing the feature definition file for review and approval.
This comprehensive process ensures governance and reduces the risk of errors in production.
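The backtesting check can be illustrated with a simplified model of late-arriving data. The sketch below is plain Python and rests on an assumption not stated in this page: that a feature job only reads events older than the job time minus a "blind spot", so any such event that lands in the warehouse after the job has run would be missed at serving time but present at training time. The `records` data and the `late_records` helper are hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical records: (event_timestamp, time the record became available).
records = [
    (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 9, 5)),
    (datetime(2024, 1, 1, 9, 30), datetime(2024, 1, 1, 10, 20)),  # arrives 50 minutes late
    (datetime(2024, 1, 1, 9, 50), datetime(2024, 1, 1, 9, 55)),
]


def late_records(job_time, blind_spot):
    """Records inside the job's cutoff that had not yet arrived when the job ran."""
    cutoff = job_time - blind_spot
    return [
        (event_ts, available_ts)
        for event_ts, available_ts in records
        if event_ts < cutoff and available_ts > job_time
    ]


job_time = datetime(2024, 1, 1, 10, 0)
missed_short = late_records(job_time, timedelta(minutes=10))  # cutoff 09:50
missed_long = late_records(job_time, timedelta(minutes=60))   # cutoff 09:00
```

With the 10-minute blind spot the 09:30 event falls inside the cutoff but arrives after the job, so it is missed. With the 60-minute blind spot no event falls inside the cutoff, at the cost of a less fresh feature.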
Tutorials
- Credit Default UI Tutorials: Deploy and Serve.
- Credit Default SDK Tutorials: Deploy and Serve.
- Grocery UI Tutorials: Deploy and Serve.
- Grocery SDK Tutorials: Deploy and Serve.
Step 6: Manage the Feature Life Cycle
- Update Feature Job Settings and Cleaning Operations: Adjust configurations when data quality or availability changes.
- Create New Feature Versions: Produce updated versions reflecting the latest settings and mark them as default.
- Monitor Feature Job Status: Regularly review feature job status to track performance and ensure smooth operations.
In catalogs with Approval Flow enabled, changes in table metadata trigger a review process. This ensures:
- New versions of features and feature lists address any data-related issues.
- New deployments use the latest, approved configurations.
- All changes are clearly documented for compliance and reproducibility.
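The idea of creating a new feature version and marking it as default can be sketched with a small in-memory registry. This is plain Python, not the FeatureByte SDK. `FeatureVersion`, `FeatureRecord`, and the setting names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FeatureVersion:
    version: int
    settings: Dict[str, str]


@dataclass
class FeatureRecord:
    """Hypothetical registry entry holding every version of one feature."""
    name: str
    versions: List[FeatureVersion] = field(default_factory=list)
    default_version: Optional[int] = None

    def create_version(self, settings: Dict[str, str], make_default: bool = False) -> FeatureVersion:
        """Append a new version; earlier versions are left untouched."""
        new = FeatureVersion(version=len(self.versions) + 1, settings=dict(settings))
        self.versions.append(new)
        # The first version becomes the default automatically.
        if make_default or self.default_version is None:
            self.default_version = new.version
        return new

    def default(self) -> FeatureVersion:
        """Return the version currently marked as default."""
        return self.versions[self.default_version - 1]


feature = FeatureRecord("spend_7d")
feature.create_version({"blind_spot": "10m", "imputed_amount": "0"})
# Data availability changed, so a new version with a longer blind spot is created.
feature.create_version({"blind_spot": "60m", "imputed_amount": "0"}, make_default=True)
```

Keeping old versions alongside the new default is what makes earlier training runs reproducible while new deployments pick up the latest settings.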
Tutorials
- Grocery UI Tutorials: Manage Life Cycle
- SDK Tutorials: Manage Life Cycle