FeatureByte's SDK Tutorials

Welcome aboard! You're about to embark on a learning journey with FeatureByte's SDK. Whether you're a beginner eager to get started or an expert looking to delve deeper, these tutorials have something for everyone. Step by step, we'll guide you through creating a catalog, registering its data model, formulating your use case, crafting features, computing training data, and deploying and managing those features.

For a holistic view of FeatureByte's open-source platform, along with insights into the overall workflow and the intricacies of the SDK, please head over to our documentation main page and explore the workflow and SDK overview sections.

Dataset Overview

The dataset for our main tutorial is the 'French grocery dataset'. It consists of four tables with data from a chain of grocery stores:

GroceryCustomer: Customer details, including their name, address, and date of birth.
GroceryInvoice: Grocery invoice details, containing the timestamp and the total amount of the invoice.
InvoiceItems: The grocery item details within each invoice, including the quantity, total cost, discount applied, and product ID.
GroceryProduct: The product group description for each grocery product.

The dataset can support a number of prediction use cases. In these tutorials, we consider predicting the amount active customers will spend in the next 2 weeks.
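To make the prediction target concrete, here is a minimal sketch in plain Python, not the FeatureByte SDK. It assumes invoices are available as (customer ID, timestamp, amount) records; the record layout, the sample values, and the helper name `spend_next_14_days` are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical invoice records: (customer_id, invoice timestamp, total amount).
invoices = [
    ("c1", datetime(2023, 1, 3, 10, 0), 20.0),
    ("c1", datetime(2023, 1, 10, 9, 30), 15.5),
    ("c1", datetime(2023, 1, 20, 18, 0), 40.0),
    ("c2", datetime(2023, 1, 5, 12, 0), 8.0),
]


def spend_next_14_days(customer_id, point_in_time, records,
                       horizon=timedelta(days=14)):
    """Sum invoice amounts in the window (point_in_time, point_in_time + horizon]."""
    end = point_in_time + horizon
    return sum(
        amount
        for cust, ts, amount in records
        if cust == customer_id and point_in_time < ts <= end
    )


# Target for customer c1 as of 1 January: only the first two invoices fall
# inside the two-week horizon.
target_c1 = spend_next_14_days("c1", datetime(2023, 1, 1), invoices)
# Customer c2 has no invoices after 6 January, so the target is zero.
target_c2 = spend_next_14_days("c2", datetime(2023, 1, 6), invoices)
```

The window is open on the left and looks only forward from the point in time, so the target never overlaps with information that would be available when the prediction is made.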

French grocery dataset

Note

To simulate a production environment, the data resides in the tutorial's data warehouse, where it's dynamically updated with new records every hour.

Getting Started

For Practitioners: If you aim to run the notebooks and immerse yourself in the end-to-end workflow, please follow the instructions for the tutorials installation and execute each notebook in sequence.
For Readers: If you're here just to read and understand, feel free to jump to any section of your interest.

And if you're itching to explore more, after completing up to step 7 in the end-to-end workflow, dive deep into specific feature notebooks to witness the full power of FeatureByte's feature engineering capabilities.

End-to-End Workflow

Create catalog

Define the Data Model of the catalog
Register tables
Register entities
Update descriptions to tables (optional)
Set Default Cleaning Operations

Formulate your use case
Formulate Use Case
Create Observation Tables

Create features
Create lookup feature
Create window aggregate features
Derive features from other features
Derive similarity features from bucketing
Use embeddings
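As an illustration of the idea behind similarity features from bucketing, the sketch below groups a customer's spending by product group and compares that distribution with the spending of all customers using cosine similarity. It is written in plain Python rather than the FeatureByte SDK, and it is only one plausible reading of the technique; the record layout, sample values, and function names are assumptions.

```python
import math
from collections import defaultdict

# Hypothetical invoice items: (customer_id, product_group, total_cost).
items = [
    ("c1", "Dairy", 10.0),
    ("c1", "Bakery", 10.0),
    ("c2", "Dairy", 5.0),
    ("c2", "Wine", 15.0),
]


def bucket_by_group(rows):
    """Sum total cost per product group, returning a {group: amount} dict."""
    totals = defaultdict(float)
    for _, group, cost in rows:
        totals[group] += cost
    return dict(totals)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in set(a) | set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # no spending on one side: treat as no similarity
    return dot / (norm_a * norm_b)


# Spending per product group for one customer and for all customers.
c1_basket = bucket_by_group([row for row in items if row[0] == "c1"])
all_basket = bucket_by_group(items)

# A value near 1 means the customer's spending mix resembles the overall mix.
similarity = cosine_similarity(c1_basket, all_basket)
```

Storing each bucketed aggregate as a dictionary keeps the comparison independent of how many product groups exist, since missing groups simply count as zero.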

Compute training data for your use case
Create feature list
Compute historical feature values
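Conceptually, computing historical feature values means evaluating each feature as of the point in time of every observation, using only data that existed before that moment. The sketch below shows this for a trailing 28-day spend feature in plain Python. It does not use the FeatureByte SDK; the observation list, the 28-day window, and the names `trailing_sum` and `spend_28d` are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical invoice records: (customer_id, invoice timestamp, total amount).
invoices = [
    ("c1", datetime(2023, 1, 3, 10, 0), 20.0),
    ("c1", datetime(2023, 1, 10, 9, 30), 15.5),
    ("c1", datetime(2023, 1, 20, 18, 0), 40.0),
    ("c2", datetime(2023, 1, 5, 12, 0), 8.0),
]

# Hypothetical observation set: one row per (customer, point in time).
observations = [
    ("c1", datetime(2023, 1, 15)),
    ("c1", datetime(2023, 2, 1)),
    ("c2", datetime(2023, 1, 15)),
]


def trailing_sum(customer_id, point_in_time, records, window):
    """Sum amounts in [point_in_time - window, point_in_time), never after it."""
    start = point_in_time - window
    return sum(
        amount
        for cust, ts, amount in records
        if cust == customer_id and start <= ts < point_in_time
    )


# One training row per observation, with the feature evaluated as of that moment.
training_rows = [
    {
        "customer_id": cust,
        "point_in_time": pit,
        "spend_28d": trailing_sum(cust, pit, invoices, timedelta(days=28)),
    }
    for cust, pit in observations
]
```

The same customer can appear at several points in time, and each row sees a different slice of history, which is what keeps the training data free of leakage from the future.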

Deploy and manage your features
Deploy and serve a feature list
Manage feature life cycle

Expand Your Horizons

Want more? Learn by example! Here are some additional feature examples tailored for various entities:

If you are interested in integrating your own transformer models for text processing or other transformations within the FeatureByte SDK, you can do so by registering a User Defined Function (UDF).

For step-by-step guidance on creating a SQL Embedding UDF, visit the Bring Your Own Transformer tutorial.

Download Tutorials

Download all the end-to-end workflow notebooks here.
