FeatureByte's SDK Tutorials

Welcome aboard! You're about to embark on a learning journey with FeatureByte's SDK. Whether you're a beginner eager to get started or an expert looking to delve deeper, these tutorials have something for everyone. Step by step, we'll guide you through creating a catalog, registering its data model, formulating your use case, crafting features, computing training data, and deploying and managing those features.

For a holistic view of FeatureByte's open-source platform, along with insights into the overall workflow and the intricacies of the SDK, please head over to our documentation main page and explore the workflow and SDK overview sections.

Dataset Overview

The dataset for our main tutorial is the 'French grocery dataset'. It consists of four tables with data from a chain of grocery stores:

  • GroceryCustomer: Customer details, including their name, address, and date of birth.

  • GroceryInvoice: Grocery invoice details, containing the timestamp and the total amount of the invoice.

  • InvoiceItems: The grocery item details within each invoice, including the quantity, total cost, discount applied, and product ID.

  • GroceryProduct: The product group description for each grocery product.

The dataset can support a number of prediction use cases. In these tutorials, we predict the amount that active customers will spend in the next 2 weeks.
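To make the prediction target concrete, here is a minimal sketch in plain Python rather than the FeatureByte SDK. It assumes a toy stand-in for the GroceryInvoice table; the tuple layout, the customer IDs, the amounts, and the helper name `spend_in_next_days` are all hypothetical and chosen only for illustration. For a given customer and point in time, the target is taken to be the sum of invoice amounts over the following 14 days.

```python
from datetime import datetime, timedelta

# Toy stand-in for GroceryInvoice rows: (customer id, invoice timestamp, amount).
# The layout and values are illustrative assumptions, not the real schema.
INVOICES = [
    ("c1", datetime(2023, 1, 2), 10.0),
    ("c1", datetime(2023, 1, 10), 20.0),
    ("c1", datetime(2023, 1, 20), 5.0),
    ("c2", datetime(2023, 1, 5), 7.5),
    ("c2", datetime(2023, 2, 1), 3.0),
]


def spend_in_next_days(invoices, customer_id, point_in_time, days=14):
    """Sum a customer's invoice amounts in (point_in_time, point_in_time + days]."""
    window_end = point_in_time + timedelta(days=days)
    return sum(
        (
            amount
            for cid, ts, amount in invoices
            if cid == customer_id and point_in_time < ts <= window_end
        ),
        0.0,  # start value, so customers with no invoices get 0.0
    )


# Targets as of 1 January 2023 for both toy customers.
target_c1 = spend_in_next_days(INVOICES, "c1", datetime(2023, 1, 1))
target_c2 = spend_in_next_days(INVOICES, "c2", datetime(2023, 1, 1))
```

Only invoices strictly after the point in time count toward the target, so the value never includes anything already known when the prediction is made. The tutorials express the same idea through the SDK, against the warehouse tables rather than an in-memory list.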

[Figure: French grocery dataset]

Note

To simulate a production environment, the data resides in the tutorial's data warehouse, where it's dynamically updated with new records every hour.

Getting Started

  • For Practitioners: If you aim to run the notebooks and immerse yourself in the end-to-end workflow, please follow the tutorial installation instructions and execute each notebook in sequence.

  • For Readers: If you're here just to read and understand, feel free to jump to any section of your interest.

End-to-End Workflow

1. Create catalog

Define the Data Model of the catalog

2. Register tables

3. Register entities

4. Update table descriptions (optional)

5. Set default cleaning operations

Formulate your use case

6. Formulate use case

7. Create observation tables

Create features

8. Create lookup feature

9. Create window aggregate features

10. Derive features from other features

11. Derive similarity features from bucketing

12. Use embeddings

Compute training data for your use case

13. Create feature list

14. Compute historical feature values

Deploy and manage your features

15. Deploy and serve a feature list

16. Manage feature life cycle

Download the tutorials here

Download all the Grocery tutorial notebooks here