FeatureByte's SDK Tutorials
Welcome aboard! You're about to embark on a learning journey with FeatureByte's SDK. Whether you're a beginner eager to get started or an expert looking to delve deeper, these tutorials have something for everyone. Step by step, we'll guide you through creating a catalog, registering its data model, formulating your use case, crafting features, computing training data, and deploying and managing those features.
For a holistic view of FeatureByte's open-source platform, along with insights into the overall workflow and the intricacies of the SDK, please head over to our documentation main page and explore the workflow and SDK overview sections.
Dataset Overview
The dataset for our main tutorial is the 'French grocery dataset'. It consists of four tables with data from a chain of grocery stores:
- GroceryCustomer: Customer details, including their name, address, and date of birth.
- GroceryInvoice: Grocery invoice details, containing the timestamp and the total amount of the invoice.
- InvoiceItems: The grocery item details within each invoice, including the quantity, total cost, discount applied, and product ID.
- GroceryProduct: The product group description for each grocery product.
The dataset can support a number of prediction use cases. In these tutorials, we consider predicting the amount that active customers will spend in the next 2 weeks.
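To make the prediction target concrete, here is a minimal sketch of "amount spent in the next 2 weeks" computed per customer in plain Python. It does not use the FeatureByte SDK, and it is not how the tutorials define the target. The tuple layout `(customer_id, timestamp, amount)`, the function name `spend_in_next_two_weeks`, and the sample rows are all assumptions made for illustration. The sketch also leaves out any definition of an "active" customer.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def spend_in_next_two_weeks(invoices, point_in_time):
    """Sum each customer's invoice amounts in the 14 days after point_in_time.

    `invoices` is an iterable of (customer_id, timestamp, amount) tuples.
    This layout is hypothetical, not the actual GroceryInvoice schema.
    """
    window_end = point_in_time + timedelta(days=14)
    totals = defaultdict(float)
    for customer_id, timestamp, amount in invoices:
        # Only invoices strictly after the observation point count,
        # so the target never includes information from the past.
        if point_in_time < timestamp <= window_end:
            totals[customer_id] += amount
    return dict(totals)


# Made-up sample rows, for illustration only.
invoices = [
    ("a", datetime(2023, 1, 2), 10.0),    # inside the window
    ("a", datetime(2023, 1, 20), 5.0),    # after the window
    ("b", datetime(2023, 1, 10), 7.5),    # inside the window
    ("b", datetime(2022, 12, 30), 3.0),   # before the observation point
]

target = spend_in_next_two_weeks(invoices, datetime(2023, 1, 1))
print(target)  # {'a': 10.0, 'b': 7.5}
```

Customers with no invoices in the window are simply absent from the result. A real pipeline would likely need to decide whether to treat them as zero spend.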
Note
To simulate a production environment, the data resides in the tutorial's data warehouse, where it's dynamically updated with new records every hour.
Getting Started
- For Practitioners: If you aim to run the notebooks and immerse yourself in the end-to-end workflow, please follow the tutorial installation instructions and execute each notebook in sequence.
- For Readers: If you're here just to read and understand, feel free to jump to any section that interests you.
End-to-End Workflow
Define the Data Model of the catalog
4. Update descriptions to tables (optional)
5. Set Default Cleaning Operations
Formulate your use case
Create features
9. Create window aggregate features
10. Derive features from other features
11. Derive similarity features from bucketing
12. Use embeddings
Compute training data for your use case
14. Compute historical feature values
Deploy and manage your features
15. Deploy and serve a feature list
Download the tutorials here
Download all the Grocery tutorial notebooks here