Claude¶
The FeatureByte Skills for Claude package equips Claude with expert knowledge of the FeatureByte SDK and REST API. Once installed, you can drive an end-to-end machine learning workflow, from warehouse setup through model training, by simply describing what you want in natural language.
Access¶
The FeatureByte Skills for Claude package is available on request. Contact your FeatureByte representative to obtain access.
What It Does¶
The skill gives Claude deep knowledge of the FeatureByte SDK (v3.4), REST API, and full ML workflows. Typing /featurebyte in Claude Code activates the main skill, which automatically routes to specialized sub-skills as needed:
| Sub-skill | Purpose |
|---|---|
| /featurebyte:setup | Warehouse connection and table registration |
| /featurebyte:explore | Catalog exploration |
| /featurebyte:eda | EDA and data cleaning |
| /featurebyte:features | Feature engineering |
| /featurebyte:ml | Feature ideation and model training pipelines |
| /featurebyte:forecast | Time series forecasting |
| /featurebyte:deploy | Deployment and serving |
Prerequisites¶
- Python 3.12+
- uv package manager
- A running FeatureByte instance (local or cloud)
- Claude Code CLI or IDE extension
Installation¶
Install Claude Code using Anthropic's recommended native installer (auto-updates in the background):
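At the time of writing, the native installer is a one-line script; check Anthropic's setup guide if the URL has changed:

```shell
# Native installer for macOS/Linux (verify the current URL in Anthropic's setup guide)
curl -fsSL https://claude.ai/install.sh | bash
```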
Alternatives: Homebrew (brew install --cask claude-code), npm (npm install -g @anthropic-ai/claude-code), or the desktop app. See the official setup guide for all options.
macOS: install everything with Homebrew
If you prefer Homebrew, you can install Claude Code and uv together:
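```shell
# Install Claude Code (cask) and uv with Homebrew
brew install --cask claude-code
brew install uv
```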
Note that Homebrew installs do not auto-update; run brew upgrade claude-code periodically.
Then clone the skills repository and install the FeatureByte SDK into a virtual environment so the Python code Claude generates can actually run:
```shell
git clone https://github.com/featurebyte/claude-skills.git
cd claude-skills

# Create a virtual environment
uv venv

# Activate it so `python` and `featurebyte` are on PATH for Claude
source .venv/bin/activate

# Install the FeatureByte SDK and other dependencies into the venv
uv sync

# Start Claude Code from the same shell
claude
```
Windows
On Windows, activate the venv with .venv\Scripts\activate (CMD) or .venv\Scripts\Activate.ps1 (PowerShell).
On subsequent sessions you don't need to re-clone or re-run uv sync. Just cd into the repo, reactivate the venv, and start Claude:
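```shell
cd claude-skills
source .venv/bin/activate
claude
```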
Claude Code automatically picks up the skills from .claude/skills/, and the /featurebyte command is available immediately.
End-to-End Example: Department Store Sales Forecast¶
The transcript below is condensed from a real working session. It shows how a data scientist built a 28-day department-store sales forecast on the M5 dataset, purely through conversation.
1. Connect and Create a Catalog¶
The staging profile is active and connected to https://staging.featurebyte.com/api/v1 using SDK v3.4.2. What would you like to do next?
> I would like to create a new catalog on staging to forecast the next 28 days
of sales at the department store level.
Claude asks clarifying questions before taking action: catalog name, feature store, data tables, timezone, and feature job schedule.
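Under the hood, this step reduces to a single SDK call. A minimal sketch, with illustrative catalog and feature-store names (the real values come from Claude's clarifying questions):

```python
import featurebyte as fb

# Illustrative names; Claude collects the real ones from the user's answers.
catalog = fb.Catalog.create(
    name="dept_store_sales_forecast",
    feature_store_name="staging_feature_store",
)
```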
2. Register Tables¶
After the user points Claude at DEPT_STORE_SALES_AMOUNT, it analyzes the schema, recognizes it as a classic time-series setup, and asks for timezone and job cadence. The user asks it to explore additional tables in the schema:
| Table | Rows | Type | Description |
|---|---|---|---|
| CALENDAR | 5,928 | CalendarTable | Events/holidays by state and day |
| DEPT_STORE | 70 | DimensionTable | Maps dept_store_id → dept, store, state |
| DEPT_CAT | 7 | DimensionTable | Maps department → category |
| STORE_STATE | 10 | DimensionTable | Maps store → state |
| FOOD_STAMPS | 41,496 | CalendarTable | Daily food-stamp eligibility by dept-state |
Claude registered six tables and set up the entity relationships (Department Store → Department, Store, Department State; Department → Category; Store → State).
3. Set up Context, Target, and Use Case¶
Claude creates the context, then a regression target daily_sales_amount (fill value 0), and links them via a use case.
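Roughly, the generated SDK code has the following shape. This is a hedged sketch: the entity, column, and object names are assumptions, and the exact parameters of the real session may differ:

```python
import featurebyte as fb

# Context: the entity the use case serves predictions for (names are illustrative).
context = fb.Context.create(
    name="dept_store_daily_sales",
    primary_entity=["dept_store"],
)

# Target: a forward-looking aggregation defines the regression target,
# with missing values filled with 0.
sales_view = fb.Table.get("DEPT_STORE_SALES_AMOUNT").get_view()
target = sales_view.groupby("dept_store_id").forward_aggregate(
    value_column="sales_amount",
    method="sum",
    window="1d",
    target_name="daily_sales_amount",
    fill_value=0,
)
target.save()

# Use case: links the context and target together.
use_case = fb.UseCase.create(
    name="dept_store_28d_sales_forecast",
    context_name="dept_store_daily_sales",
    target_name="daily_sales_amount",
)
```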
4. Create Observation Tables¶
> Let's create 2 observation tables. Training and EDA with weekly prediction on
Monday at 3.30am with forecast points the next 28 days including the first
Monday. Starts from 2012-02-18 and ends at 2016-04-25. Keep EDA small at 50K.
| Table | Rows | Purpose |
|---|---|---|
| Training | 106,960 | Training |
| EDA | 50,000 | EDA (assigned to use case) |
| Validation | 1,960 | Validation/Test |
The validation table was built from a user-provided CSV of dept_store_id values, with FORECAST_POINT values covering the 28 days after the training period.
5. Run Feature Ideation¶
After suggesting entity assignments and accepting the user's tweaks (adding Department and Store to the sales table), Claude kicks off the ideation pipeline. About an hour later, the pipeline returns 510 features across 34 themes. Claude then reads the ideation report and summarizes it:
- Top predictive signals: ratio features comparing recent sales to the 182-day baseline, conditioned on food-stamp × weekday, forecast weekday, and department-level trends.
- Best model: LightGBM refined to 34 features, R² 0.9747, MAE 180.06.
- Dominant feature: 182-day average sales (naive baseline) at ~74% importance.
6. Explore Unselected Features¶
Claude surfaces recency features (~0.011 score, not selected), short-window timing features, and similarity features. The user selects recency features for inclusion. Claude saves them and builds an expanded feature list of 40 features.
7. Train, Compare, Iterate¶
Each feature list is trained and scored on the validation set. Claude tracks the cumulative improvement across iterations:
| Metric | 34 feat (refined) | 40 (+recency) | 43 (+share) | 46 (+weekend) |
|---|---|---|---|---|
| R² | 0.9747 | 0.9749 | 0.9753 | 0.9755 |
| MAE | 181.23 | 179.48 | 179.24 | 179.07 |
| Median AE | 118.60 | 117.32 | 118.03 | 115.33 |
| Poisson Deviance | 29.02 | 28.83 | 28.67 | 28.52 |
| RMSE | 272.15 | 270.83 | 268.70 | 267.78 |
8. Build Custom Features from Ideas¶
When the user asks "any new feature ideas?", Claude proposes share-of-store, year-over-year comparisons, specific lag offsets, event proximity, and cross-department correlations, then checks which are already covered by ideation so effort is focused on genuinely new signal:
- Share-of-store: genuinely new (ideation only has cosine-similarity features, not raw share ratios).
- Weekend × Food-stamp: not in ideation, though Weekday × Food-stamp was a top scorer.
Claude then writes the Python, saves the features to the catalog, and trains the model. The Median AE drops from 118.60 to 115.33 (-2.8%) after these additions.
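For illustration, a share-of-store ratio feature in the FeatureByte SDK might look roughly like this. Table, column, entity, and window names are assumptions, not the session's actual code:

```python
import featurebyte as fb

# Illustrative sketch; table, column, and entity names are assumptions.
sales_view = fb.Table.get("DEPT_STORE_SALES_AMOUNT").get_view()

# 28-day sales at the department-store grain.
dept_sales = sales_view.groupby("dept_store_id").aggregate_over(
    value_column="sales_amount",
    method="sum",
    windows=["28d"],
    feature_names=["dept_sales_28d"],
)["dept_sales_28d"]

# 28-day sales at the store grain.
store_sales = sales_view.groupby("store_id").aggregate_over(
    value_column="sales_amount",
    method="sum",
    windows=["28d"],
    feature_names=["store_sales_28d"],
)["store_sales_28d"]

# Share of the store's recent sales captured by this department.
share_of_store = dept_sales / store_sales
share_of_store.name = "dept_share_of_store_28d"
share_of_store.save()
```

Because the ratio is built from point-in-time-correct aggregations, the derived feature inherits the same guarantee and is catalog-registered like any ideated feature.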
Why This Works¶
The example illustrates the main advantages of the skill:
- Conversational workflow. No boilerplate for catalog setup, context creation, observation tables, or task polling.
- Schema-aware suggestions. Claude reads table contents before proposing entity assignments, relationships, and feature ideas.
- FeatureByte sets a high floor. The ideation pipeline delivers a Kaggle-grandmaster baseline out of the box, competitive with the best Tabular Foundation Models, with every ideated feature's SDK code visible for inspection. Claude builds on that foundation rather than starting from zero.
- Claude translates intuition into features. When a data scientist says "I wonder if share-of-store matters" or "the weekend version of this interaction is probably different", Claude turns those half-formed hypotheses into working FeatureByte SDK code. The domain expert doesn't need to know the declarative syntax, the aggregation primitives, or the category-groupby pattern; they need to know their data. The resulting feature is point-in-time correct and catalog-registered like any other.
- Report comprehension. Claude reads ideation reports and summarizes what mattered, not just what ran.
- Feature gap analysis. Before building new features, Claude checks whether the ideation pipeline already covered them, so effort lands on genuinely new signal.
- FeatureByte catalogs everything. Every asset the conversation produces (context, target, use case, observation tables, ideated features, trained models) lands in the FeatureByte UI as a first-class, versioned object. The session is not ephemeral chat output; it is a governed catalog that teammates can inspect, extend, and deploy from.
- Iterative model comparison. Each feature batch is trained and compared against prior runs in a single table.
Further Reading¶
- Claude Code documentation: installing and configuring the CLI
- API Overview: the REST API that powers the skill's orchestration calls