{ "cells": [ { "cell_type": "markdown", "id": "0e8be874", "metadata": {}, "source": [ "### Create a feature list\n", "\n", "A feature list is a collection of features that can be used as inputs to a machine learning model.\n", "\n", "Let's take some of the features we created and bundle them into a feature list." ] }, { "cell_type": "code", "execution_count": 1, "id": "f1bad689-fd55-4606-9430-639fa8706f70", "metadata": { "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "\u001b[32;20m16:44:15\u001b[0m | \u001b[1m\u001b[38;20mINFO \u001b[0m\u001b[0m | \u001b[1m\u001b[38;20mUsing configuration file at: /Users/viktor/.featurebyte/config.yaml\u001b[0m\u001b[0m\n", "\u001b[32;20m16:44:15\u001b[0m | \u001b[1m\u001b[38;20mINFO \u001b[0m\u001b[0m | \u001b[1m\u001b[38;20mActive profile: tutorial (https://tutorials.featurebyte.com/api/v1)\u001b[0m\u001b[0m\n", "\u001b[32;20m16:44:15\u001b[0m | \u001b[1m\u001b[38;20mINFO \u001b[0m\u001b[0m | \u001b[1m\u001b[38;20mSDK version: 0.6.0.dev121\u001b[0m\u001b[0m\n", "\u001b[32;20m16:44:15\u001b[0m | \u001b[1m\u001b[38;20mINFO \u001b[0m\u001b[0m | \u001b[1m\u001b[38;20mNo catalog activated.\u001b[0m\u001b[0m\n", "\u001b[32;20m16:44:16\u001b[0m | \u001b[1m\u001b[38;20mINFO \u001b[0m\u001b[0m | \u001b[1m\u001b[38;20m10 feature lists, 59 features deployed\u001b[0m\u001b[0m\n", "\u001b[32;20m16:44:16\u001b[0m | \u001b[1m\u001b[38;20mINFO \u001b[0m\u001b[0m | \u001b[1m\u001b[38;20mUsing profile: tutorial\u001b[0m\u001b[0m\n", "\u001b[32;20m16:44:16\u001b[0m | \u001b[1m\u001b[38;20mINFO \u001b[0m\u001b[0m | \u001b[1m\u001b[38;20mUsing configuration file at: /Users/viktor/.featurebyte/config.yaml\u001b[0m\u001b[0m\n", "\u001b[32;20m16:44:16\u001b[0m | \u001b[1m\u001b[38;20mINFO \u001b[0m\u001b[0m | \u001b[1m\u001b[38;20mActive profile: tutorial (https://tutorials.featurebyte.com/api/v1)\u001b[0m\u001b[0m\n", "\u001b[32;20m16:44:16\u001b[0m | \u001b[1m\u001b[38;20mINFO \u001b[0m\u001b[0m | \u001b[1m\u001b[38;20mSDK version: 
0.6.0.dev121\u001b[0m\u001b[0m\n", "\u001b[32;20m16:44:16\u001b[0m | \u001b[1m\u001b[38;20mINFO \u001b[0m\u001b[0m | \u001b[1m\u001b[38;20mNo catalog activated.\u001b[0m\u001b[0m\n", "\u001b[32;20m16:44:16\u001b[0m | \u001b[1m\u001b[38;20mINFO \u001b[0m\u001b[0m | \u001b[1m\u001b[38;20m10 feature lists, 59 features deployed\u001b[0m\u001b[0m\n", "\u001b[32;20m16:44:17\u001b[0m | \u001b[1m\u001b[38;20mINFO \u001b[0m\u001b[0m | \u001b[1m\u001b[38;20mCatalog activated: Grocery Dataset Tutorial\u001b[0m\u001b[0m\n" ] } ], "source": [ "import featurebyte as fb\n", "\n", "# Set your profile to the tutorial environment\n", "fb.use_profile(\"tutorial\")\n", "\n", "catalog_name = \"Grocery Dataset Tutorial\"\n", "catalog = fb.Catalog.activate(catalog_name) " ] }, { "cell_type": "markdown", "id": "5423e3fe-0b7b-4312-b5ef-996ebdb3eb76", "metadata": {}, "source": [ "#### List all features we created so far" ] }, { "cell_type": "code", "execution_count": 2, "id": "035b48c3-0ec4-4481-b5c1-90e955da4db7", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr><th></th><th>id</th><th>name</th><th>dtype</th><th>readiness</th><th>online_enabled</th><th>tables</th><th>primary_tables</th><th>entities</th><th>primary_entities</th><th>created_at</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>0</th><td>6564b93614db530858940f50</td><td>CUSTOMER_vs_OVERALL_item_TotalCost_across_prod...</td><td>FLOAT</td><td>DRAFT</td><td>False</td><td>[GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT]</td><td>[INVOICEITEMS]</td><td>[customer]</td><td>[customer]</td><td>2023-11-27T15:44:00.351000</td></tr>\n",
" <tr><th>1</th><td>6564b91c69ee318ce739930c</td><td>CUSTOMER_Latest_invoice_Amount_Z_Score_to_invo...</td><td>FLOAT</td><td>DRAFT</td><td>False</td><td>[GROCERYINVOICE]</td><td>[GROCERYINVOICE]</td><td>[customer]</td><td>[customer]</td><td>2023-11-27T15:43:32.311000</td></tr>\n",
" <tr><th>2</th><td>6564b8f0a69f1a9a43bdfd85</td><td>CUSTOMER_Std_of_invoice_Amount_28d</td><td>FLOAT</td><td>DRAFT</td><td>False</td><td>[GROCERYINVOICE]</td><td>[GROCERYINVOICE]</td><td>[customer]</td><td>[customer]</td><td>2023-11-27T15:42:56.115000</td></tr>\n",
" <tr><th>3</th><td>6564b8f0a69f1a9a43bdfd84</td><td>CUSTOMER_Std_of_invoice_Amount_14d</td><td>FLOAT</td><td>DRAFT</td><td>False</td><td>[GROCERYINVOICE]</td><td>[GROCERYINVOICE]</td><td>[customer]</td><td>[customer]</td><td>2023-11-27T15:42:54.994000</td></tr>\n",
" <tr><th>4</th><td>6564b8f0a69f1a9a43bdfd83</td><td>CUSTOMER_Avg_of_invoice_Amount_28d</td><td>FLOAT</td><td>DRAFT</td><td>False</td><td>[GROCERYINVOICE]</td><td>[GROCERYINVOICE]</td><td>[customer]</td><td>[customer]</td><td>2023-11-27T15:42:53.425000</td></tr>\n",
" <tr><th>5</th><td>6564b8f0a69f1a9a43bdfd82</td><td>CUSTOMER_Avg_of_invoice_Amount_14d</td><td>FLOAT</td><td>DRAFT</td><td>False</td><td>[GROCERYINVOICE]</td><td>[GROCERYINVOICE]</td><td>[customer]</td><td>[customer]</td><td>2023-11-27T15:42:52.201000</td></tr>\n",
" <tr><th>6</th><td>6564b8f0a69f1a9a43bdfd80</td><td>CUSTOMER_Count_of_invoice_28d</td><td>FLOAT</td><td>DRAFT</td><td>False</td><td>[GROCERYINVOICE]</td><td>[GROCERYINVOICE]</td><td>[customer]</td><td>[customer]</td><td>2023-11-27T15:42:51.056000</td></tr>\n",
" <tr><th>7</th><td>6564b8f0a69f1a9a43bdfd7e</td><td>CUSTOMER_Count_of_invoice_14d</td><td>FLOAT</td><td>DRAFT</td><td>False</td><td>[GROCERYINVOICE]</td><td>[GROCERYINVOICE]</td><td>[customer]</td><td>[customer]</td><td>2023-11-27T15:42:50.243000</td></tr>\n",
" <tr><th>8</th><td>6564b8f0a69f1a9a43bdfd7d</td><td>CUSTOMER_Latest_invoice_Amount</td><td>FLOAT</td><td>DRAFT</td><td>False</td><td>[GROCERYINVOICE]</td><td>[GROCERYINVOICE]</td><td>[customer]</td><td>[customer]</td><td>2023-11-27T15:42:49.486000</td></tr>\n",
" <tr><th>9</th><td>6564b8c8e991187ee84de74e</td><td>CUSTOMER_Age_band</td><td>VARCHAR</td><td>DRAFT</td><td>False</td><td>[GROCERYCUSTOMER]</td><td>[GROCERYCUSTOMER]</td><td>[customer]</td><td>[customer]</td><td>2023-11-27T15:42:12.722000</td></tr>\n",
" <tr><th>10</th><td>6564b8c8e991187ee84de744</td><td>CUSTOMER_Age</td><td>INT</td><td>DRAFT</td><td>False</td><td>[GROCERYCUSTOMER]</td><td>[GROCERYCUSTOMER]</td><td>[customer]</td><td>[customer]</td><td>2023-11-27T15:42:05.100000</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>\n", "
" ], "text/plain": [ " id \\\n", "0 6564b93614db530858940f50 \n", "1 6564b91c69ee318ce739930c \n", "2 6564b8f0a69f1a9a43bdfd85 \n", "3 6564b8f0a69f1a9a43bdfd84 \n", "4 6564b8f0a69f1a9a43bdfd83 \n", "5 6564b8f0a69f1a9a43bdfd82 \n", "6 6564b8f0a69f1a9a43bdfd80 \n", "7 6564b8f0a69f1a9a43bdfd7e \n", "8 6564b8f0a69f1a9a43bdfd7d \n", "9 6564b8c8e991187ee84de74e \n", "10 6564b8c8e991187ee84de744 \n", "\n", " name dtype readiness \\\n", "0 CUSTOMER_vs_OVERALL_item_TotalCost_across_prod... FLOAT DRAFT \n", "1 CUSTOMER_Latest_invoice_Amount_Z_Score_to_invo... FLOAT DRAFT \n", "2 CUSTOMER_Std_of_invoice_Amount_28d FLOAT DRAFT \n", "3 CUSTOMER_Std_of_invoice_Amount_14d FLOAT DRAFT \n", "4 CUSTOMER_Avg_of_invoice_Amount_28d FLOAT DRAFT \n", "5 CUSTOMER_Avg_of_invoice_Amount_14d FLOAT DRAFT \n", "6 CUSTOMER_Count_of_invoice_28d FLOAT DRAFT \n", "7 CUSTOMER_Count_of_invoice_14d FLOAT DRAFT \n", "8 CUSTOMER_Latest_invoice_Amount FLOAT DRAFT \n", "9 CUSTOMER_Age_band VARCHAR DRAFT \n", "10 CUSTOMER_Age INT DRAFT \n", "\n", " online_enabled tables \\\n", "0 False [GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT] \n", "1 False [GROCERYINVOICE] \n", "2 False [GROCERYINVOICE] \n", "3 False [GROCERYINVOICE] \n", "4 False [GROCERYINVOICE] \n", "5 False [GROCERYINVOICE] \n", "6 False [GROCERYINVOICE] \n", "7 False [GROCERYINVOICE] \n", "8 False [GROCERYINVOICE] \n", "9 False [GROCERYCUSTOMER] \n", "10 False [GROCERYCUSTOMER] \n", "\n", " primary_tables entities primary_entities created_at \n", "0 [INVOICEITEMS] [customer] [customer] 2023-11-27T15:44:00.351000 \n", "1 [GROCERYINVOICE] [customer] [customer] 2023-11-27T15:43:32.311000 \n", "2 [GROCERYINVOICE] [customer] [customer] 2023-11-27T15:42:56.115000 \n", "3 [GROCERYINVOICE] [customer] [customer] 2023-11-27T15:42:54.994000 \n", "4 [GROCERYINVOICE] [customer] [customer] 2023-11-27T15:42:53.425000 \n", "5 [GROCERYINVOICE] [customer] [customer] 2023-11-27T15:42:52.201000 \n", "6 [GROCERYINVOICE] [customer] [customer] 
2023-11-27T15:42:51.056000 \n", "7 [GROCERYINVOICE] [customer] [customer] 2023-11-27T15:42:50.243000 \n", "8 [GROCERYINVOICE] [customer] [customer] 2023-11-27T15:42:49.486000 \n", "9 [GROCERYCUSTOMER] [customer] [customer] 2023-11-27T15:42:12.722000 \n", "10 [GROCERYCUSTOMER] [customer] [customer] 2023-11-27T15:42:05.100000 " ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "catalog.list_features()" ] }, { "cell_type": "markdown", "id": "d41b6c22-c887-47d4-a263-b81ab9590dd4", "metadata": {}, "source": [ "#### Get features from catalog" ] }, { "cell_type": "code", "execution_count": 3, "id": "452d8ff1-d0cb-4e53-90bf-3b75ffffbfff", "metadata": { "tags": [] }, "outputs": [], "source": [ "customer_age_band = catalog.get_feature(\"CUSTOMER_Age_band\")\n", "customer_latest_invoice_amount = catalog.get_feature(\"CUSTOMER_Latest_invoice_Amount\")\n", "customer_count_of_invoice_14d = catalog.get_feature(\"CUSTOMER_Count_of_invoice_14d\")\n", "customer_avg_of_invoice_amount_14d = catalog.get_feature(\"CUSTOMER_Avg_of_invoice_Amount_14d\")\n", "customer_std_of_invoice_amount_14d = catalog.get_feature(\"CUSTOMER_Std_of_invoice_Amount_14d\")\n", "customer_latest_invoice_amount_Z_score_to_invoice_amount_28d = catalog.get_feature(\n", " \"CUSTOMER_Latest_invoice_Amount_Z_Score_to_invoice_Amount_28d\"\n", ")\n", "customer_vs_overall_item_totalcost_across_product_productgroups_26w = catalog.get_feature(\n", " \"CUSTOMER_vs_OVERALL_item_TotalCost_across_product_ProductGroups_26w\"\n", ")" ] }, { "cell_type": "markdown", "id": "4738e713-d74d-4c92-b988-16eec38d5f60", "metadata": {}, "source": [ "#### Create feature list" ] }, { "cell_type": "code", "execution_count": 4, "id": "790ccf39-b8d1-494f-833b-fc061e6eb623", "metadata": { "tags": [] }, "outputs": [], "source": [ "simple_feature_list = fb.FeatureList(\n", " [\n", " customer_age_band,\n", " customer_latest_invoice_amount,\n", " customer_count_of_invoice_14d,\n", " 
customer_avg_of_invoice_amount_14d,\n", " customer_std_of_invoice_amount_14d,\n", " customer_latest_invoice_amount_Z_score_to_invoice_amount_28d,\n", " customer_vs_overall_item_totalcost_across_product_productgroups_26w\n", " ],\n", " name=\"Customer Simple FeatureList\"\n", ")" ] }, { "cell_type": "markdown", "id": "8d82e474-ffef-4dd4-a58e-9bb387bb8dac", "metadata": {}, "source": [ "#### Preview feature list\n" ] }, { "cell_type": "code", "execution_count": 5, "id": "b3591d77-6b4e-4b5d-871d-d345905e2d34", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "[\n", " {\n", " 'name': 'customer',\n", " 'created_at': '2023-11-27T15:39:09.477000',\n", " 'updated_at': '2023-11-27T15:39:19.968000',\n", " 'description': None,\n", " 'serving_names': [\n", " 'GROCERYCUSTOMERGUID'\n", " ],\n", " 'catalog_name': 'Grocery Dataset Tutorial'\n", " }]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Check the primary entity of the feature list\n", "simple_feature_list.primary_entity" ] }, { "cell_type": "code", "execution_count": 6, "id": "91ba26b1-7009-4094-9c1e-84afbeee027c", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Downloading table |████████████████████████████████████████| 10/10 [100%] in 0.1\n" ] } ], "source": [ "# Get observation table: 'Preview Table with 10 Customers'\n", "preview_table = catalog.get_observation_table(\n", " \"Preview Table with 10 Customers\"\n", ").to_pandas()" ] }, { "cell_type": "code", "execution_count": 7, "id": "77b90735-a996-49c8-b5fe-7a5b8394aa75", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr><th></th><th>POINT_IN_TIME</th><th>GROCERYCUSTOMERGUID</th><th>CUSTOMER_Age_band</th><th>CUSTOMER_Latest_invoice_Amount</th><th>CUSTOMER_Count_of_invoice_14d</th><th>CUSTOMER_Avg_of_invoice_Amount_14d</th><th>CUSTOMER_Std_of_invoice_Amount_14d</th><th>CUSTOMER_Latest_invoice_Amount_Z_Score_to_invoice_Amount_28d</th><th>CUSTOMER_vs_OVERALL_item_TotalCost_across_product_ProductGroups_26w</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>0</th><td>2022-11-28 11:36:31</td><td>d4559f7d-eb28-42c6-b47d-847de24952c2</td><td>65-69</td><td>6.72</td><td>0</td><td>NaN</td><td>NaN</td><td>NaN</td><td>0.466220</td></tr>\n",
" <tr><th>1</th><td>2022-10-09 15:47:55</td><td>3f8c7c4c-f2c2-408e-a08e-622de3d3a0b9</td><td>55-59</td><td>12.28</td><td>0</td><td>NaN</td><td>NaN</td><td>NaN</td><td>0.558564</td></tr>\n",
" <tr><th>2</th><td>2022-09-14 15:42:42</td><td>35390325-8443-43c1-a934-18db923d9a47</td><td>15-19</td><td>10.02</td><td>0</td><td>NaN</td><td>NaN</td><td>-0.838057</td><td>0.872762</td></tr>\n",
" <tr><th>3</th><td>2022-12-26 18:39:46</td><td>4eb4ee84-ee13-4eec-9c26-61b6eb4ba35b</td><td>70-74</td><td>53.09</td><td>5</td><td>15.626000</td><td>19.000809</td><td>0.992653</td><td>0.713610</td></tr>\n",
" <tr><th>4</th><td>2022-12-06 08:47:43</td><td>e42fa5f3-7737-4c6a-9ef4-856f113e60bd</td><td>20-24</td><td>21.74</td><td>3</td><td>15.560000</td><td>6.681337</td><td>1.530644</td><td>0.657608</td></tr>\n",
" <tr><th>5</th><td>2022-11-09 12:14:40</td><td>8440debb-6abc-4adc-8c6c-749928141fd0</td><td>20-24</td><td>15.30</td><td>1</td><td>15.300000</td><td>0.000000</td><td>NaN</td><td>0.509610</td></tr>\n",
" <tr><th>6</th><td>2022-10-12 17:32:15</td><td>8a54e527-e9a4-47a9-a28f-8b3c6ecc02db</td><td>35-39</td><td>14.55</td><td>2</td><td>14.560000</td><td>0.010000</td><td>-0.049459</td><td>0.753250</td></tr>\n",
" <tr><th>7</th><td>2023-01-01 11:51:28</td><td>cea213d4-36e4-48c3-ae8d-c7a25911e11c</td><td>85-89</td><td>0.89</td><td>12</td><td>3.209167</td><td>4.154236</td><td>-0.530470</td><td>0.561260</td></tr>\n",
" <tr><th>8</th><td>2023-02-05 15:48:23</td><td>3b4f2821-b761-40e9-a32a-5f09685cc597</td><td>80-84</td><td>11.43</td><td>4</td><td>11.317500</td><td>4.526297</td><td>0.024855</td><td>0.761539</td></tr>\n",
" <tr><th>9</th><td>2023-03-10 16:15:46</td><td>91a64566-e212-4e36-8f23-c1f1f324a301</td><td>50-54</td><td>2.00</td><td>6</td><td>4.568333</td><td>2.832763</td><td>-0.743774</td><td>0.729546</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>\n", "
" ], "text/plain": [ " POINT_IN_TIME GROCERYCUSTOMERGUID CUSTOMER_Age_band \\\n", "0 2022-11-28 11:36:31 d4559f7d-eb28-42c6-b47d-847de24952c2 65-69 \n", "1 2022-10-09 15:47:55 3f8c7c4c-f2c2-408e-a08e-622de3d3a0b9 55-59 \n", "2 2022-09-14 15:42:42 35390325-8443-43c1-a934-18db923d9a47 15-19 \n", "3 2022-12-26 18:39:46 4eb4ee84-ee13-4eec-9c26-61b6eb4ba35b 70-74 \n", "4 2022-12-06 08:47:43 e42fa5f3-7737-4c6a-9ef4-856f113e60bd 20-24 \n", "5 2022-11-09 12:14:40 8440debb-6abc-4adc-8c6c-749928141fd0 20-24 \n", "6 2022-10-12 17:32:15 8a54e527-e9a4-47a9-a28f-8b3c6ecc02db 35-39 \n", "7 2023-01-01 11:51:28 cea213d4-36e4-48c3-ae8d-c7a25911e11c 85-89 \n", "8 2023-02-05 15:48:23 3b4f2821-b761-40e9-a32a-5f09685cc597 80-84 \n", "9 2023-03-10 16:15:46 91a64566-e212-4e36-8f23-c1f1f324a301 50-54 \n", "\n", " CUSTOMER_Latest_invoice_Amount CUSTOMER_Count_of_invoice_14d \\\n", "0 6.72 0 \n", "1 12.28 0 \n", "2 10.02 0 \n", "3 53.09 5 \n", "4 21.74 3 \n", "5 15.30 1 \n", "6 14.55 2 \n", "7 0.89 12 \n", "8 11.43 4 \n", "9 2.00 6 \n", "\n", " CUSTOMER_Avg_of_invoice_Amount_14d CUSTOMER_Std_of_invoice_Amount_14d \\\n", "0 NaN NaN \n", "1 NaN NaN \n", "2 NaN NaN \n", "3 15.626000 19.000809 \n", "4 15.560000 6.681337 \n", "5 15.300000 0.000000 \n", "6 14.560000 0.010000 \n", "7 3.209167 4.154236 \n", "8 11.317500 4.526297 \n", "9 4.568333 2.832763 \n", "\n", " CUSTOMER_Latest_invoice_Amount_Z_Score_to_invoice_Amount_28d \\\n", "0 NaN \n", "1 NaN \n", "2 -0.838057 \n", "3 0.992653 \n", "4 1.530644 \n", "5 NaN \n", "6 -0.049459 \n", "7 -0.530470 \n", "8 0.024855 \n", "9 -0.743774 \n", "\n", " CUSTOMER_vs_OVERALL_item_TotalCost_across_product_ProductGroups_26w \n", "0 0.466220 \n", "1 0.558564 \n", "2 0.872762 \n", "3 0.713610 \n", "4 0.657608 \n", "5 0.509610 \n", "6 0.753250 \n", "7 0.561260 \n", "8 0.761539 \n", "9 0.729546 " ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Preview simple_feature_list\n", 
"simple_feature_list.preview(preview_table)" ] }, { "cell_type": "markdown", "id": "feadc391-142f-4ce7-ba88-e6ff6ce2ed4d", "metadata": {}, "source": [ "#### Save feature list " ] }, { "cell_type": "code", "execution_count": 8, "id": "32bc2b8f-74ec-4d19-99c5-16b02e0890b2", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done! |████████████████████████████████████████| 100% in 6.5s (0.16%/s) \n", "Loading Feature(s) |████████████████████████████████████████| 7/7 [100%] in 0.7s\n" ] } ], "source": [ "# Save feature list\n", "simple_feature_list.save()\n", "# Add description\n", "simple_feature_list.update_description(\"Simple feature list for the customer\")" ] }, { "cell_type": "markdown", "id": "354b8740-717c-4314-9cca-2a8b45775129", "metadata": {}, "source": [ "### Concepts in this tutorial\n", "- [More on feature lists](https://docs.featurebyte.com/latest/about/glossary/#feature-list-creation)\n", "\n", "#### SDK Reference for\n", "- [Feature List](https://docs.featurebyte.com/latest/reference/core/feature_list/)\n", "- [FeatureList.save()](https://docs.featurebyte.com/latest/reference/featurebyte.api.feature_list.FeatureList.save/)" ] }, { "cell_type": "code", "execution_count": null, "id": "42ccd220-fcd6-41fb-9ca0-243519be8269", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.13" } }, "nbformat": 4, "nbformat_minor": 5 }