Overview
FeatureByte's SDK Tutorials
Welcome aboard! You're about to embark on a learning journey with FeatureByte's SDK. Whether you're a beginner eager to get started or an expert looking to delve deeper, these tutorials have something for everyone. Step by step, we'll guide you through creating a catalog, registering its data model, formulating your use case, crafting features, computing training data, and deploying and managing those features.
For a holistic view of FeatureByte's open-source platform, along with insights into the overall workflow and the intricacies of the SDK, please head over to our documentation main page and explore the workflow and SDK overview sections.
Dataset Overview
The main tutorial uses the 'French grocery dataset', which consists of four tables of data from a chain of grocery stores:
- GroceryCustomer: Customer details, including their name, address, and date of birth.
- GroceryInvoice: Grocery invoice details, containing the timestamp and the total amount of the invoice.
- InvoiceItems: The grocery item details within each invoice, including the quantity, total cost, discount applied, and product ID.
- GroceryProduct: The product group description for each grocery product.
The dataset could support a number of prediction use cases. In these tutorials, we will predict the amount that active customers will spend in the next 2 weeks.
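To make the prediction target concrete, here is a minimal plain-Python sketch of "amount spent in the next 2 weeks" for one customer at a given point in time. It does not use the FeatureByte SDK; the record layout, the sample values, and the function name `spend_next_14d` are assumptions made purely for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical invoice records: (customer_id, timestamp, amount).
invoices = [
    ("c1", datetime(2023, 1, 3), 20.0),
    ("c1", datetime(2023, 1, 10), 35.5),
    ("c1", datetime(2023, 1, 20), 12.0),
    ("c2", datetime(2023, 1, 5), 50.0),
]


def spend_next_14d(invoices, customer_id, point_in_time,
                   horizon=timedelta(days=14)):
    """Sum invoice amounts in (point_in_time, point_in_time + horizon]."""
    end = point_in_time + horizon
    return sum(
        amount
        for cust, ts, amount in invoices
        if cust == customer_id and point_in_time < ts <= end
    )


# Target for customer c1 as observed on 1 January 2023:
# the invoices on 3 and 10 January count, the one on 20 January does not.
target_c1 = spend_next_14d(invoices, "c1", datetime(2023, 1, 1))
print(target_c1)
```

The target only looks forward from the point in time, while features should only look backward from it; keeping the two windows disjoint is what prevents leakage.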
Note
To simulate a production environment, the data resides in the tutorial's data warehouse, where it's dynamically updated with new records every hour.
Getting Started
- For Practitioners: If you aim to run the notebooks and immerse yourself in the end-to-end workflow, please follow the instructions for the tutorials installation and execute each notebook in sequence.
- For Readers: If you're here just to read and understand, feel free to jump to any section of your interest.
If you're itching to explore more, complete the end-to-end workflow up to step 7, then dive into the specific feature notebooks to see the full power of FeatureByte's feature engineering capabilities.
End-to-End Workflow
Define the Data Model of the catalog
- Register entities
- Add descriptions to tables (optional)
- Set Default Cleaning Operations
Formulate your use case
Create features
- Create window aggregate features
- Derive features from other features
- Derive similarity features from bucketing
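The three kinds of features listed above can be sketched in plain Python before you open the notebooks. This is an illustration only, not the FeatureByte SDK: the record layout, the sample values, and the helper names `spend_by_group` and `cosine_similarity` are assumptions, and cosine similarity is just one plausible way to compare bucketed aggregates.

```python
import math
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical item records: (customer_id, timestamp, product_group, cost).
items = [
    ("c1", datetime(2023, 1, 2), "Dairy", 4.0),
    ("c1", datetime(2023, 1, 20), "Dairy", 6.0),
    ("c1", datetime(2023, 1, 28), "Bakery", 10.0),
    ("c2", datetime(2023, 1, 22), "Dairy", 8.0),
]


def spend_by_group(items, point_in_time, window, customer_id=None):
    """Total cost per product group over [point_in_time - window, point_in_time).

    With customer_id=None the aggregate covers all customers.
    """
    start = point_in_time - window
    totals = defaultdict(float)
    for cust, ts, group, cost in items:
        in_scope = customer_id is None or cust == customer_id
        if in_scope and start <= ts < point_in_time:
            totals[group] += cost
    return dict(totals)


def cosine_similarity(a, b):
    """Cosine similarity between two {bucket: value} dictionaries."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


now = datetime(2023, 2, 1)

# 1. Window aggregate features: spend over the last 7 and 28 days.
spend_7d = sum(spend_by_group(items, now, timedelta(days=7), "c1").values())
spend_28d = sum(spend_by_group(items, now, timedelta(days=28), "c1").values())

# 2. A feature derived from other features: share of recent spend.
recent_share = spend_7d / spend_28d if spend_28d else 0.0

# 3. A similarity feature from bucketing: compare the customer's spend
#    per product group with the same aggregate over all customers.
customer_basket = spend_by_group(items, now, timedelta(days=28), "c1")
overall_basket = spend_by_group(items, now, timedelta(days=28))
basket_similarity = cosine_similarity(customer_basket, overall_basket)

print(spend_7d, spend_28d, recent_share, round(basket_similarity, 3))
```

Every aggregate here looks strictly backward from `now`, so the same functions can be reused at any historical point in time.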
Compute training data for your use case
- Compute historical feature values
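As a rough picture of what computing historical feature values involves, the sketch below evaluates one feature for a small set of observations, each a customer paired with a point in time, using only the data available before that point in time. It is plain Python rather than the FeatureByte SDK, and the records, the observation set, and the name `spend_28d` are hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical invoice records: (customer_id, timestamp, amount).
invoices = [
    ("c1", datetime(2023, 1, 3), 20.0),
    ("c1", datetime(2023, 1, 10), 35.5),
    ("c1", datetime(2023, 1, 20), 12.0),
    ("c2", datetime(2023, 1, 5), 50.0),
]

# Hypothetical observation set: one row per (customer, point in time).
observations = [
    ("c1", datetime(2023, 1, 15)),
    ("c1", datetime(2023, 2, 1)),
    ("c2", datetime(2023, 1, 15)),
]


def spend_28d(invoices, customer_id, point_in_time):
    """Total invoice amount over the 28 days before point_in_time."""
    start = point_in_time - timedelta(days=28)
    return sum(
        amount
        for cust, ts, amount in invoices
        if cust == customer_id and start <= ts < point_in_time
    )


# Each training row sees only invoices strictly before its own point in time.
training_rows = [
    {
        "customer_id": cust,
        "point_in_time": pit,
        "spend_28d": spend_28d(invoices, cust, pit),
    }
    for cust, pit in observations
]

for row in training_rows:
    print(row)
```

The same customer can appear several times with different points in time, and each row gets the feature value as it would have been known at that moment.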
Deploy and manage your features
- Manage feature life cycle
Expand Your Horizons
Want more? Learn by example! Here are some additional feature examples tailored for various entities:
- the Customer entity
- the Product entity
- the Customer x Product interaction
- the Invoice entity
- the Item entity
If you are interested in integrating your own transformer models for text processing or other transformations within the FeatureByte SDK, you can do so by registering a User Defined Function (UDF).
For step-by-step guidance on creating a SQL Embedding UDF, visit the Bring Your Own Transformer tutorial.
Download Tutorials
Download all the end-to-end workflow notebooks here.