Playground: Credit Cards
This notebook creates a fresh catalog from the credit card dataset, which you can use to practice feature engineering.
Load the featurebyte library and connect to the local instance of featurebyte
In [1]:
# library imports
import pandas as pd
import numpy as np
# load the featurebyte SDK
import featurebyte as fb
# start the local server, then wait for it to be healthy before proceeding
fb.playground()
02:07:16 | INFO | Using configuration file at: /home/chester/.featurebyte/config.yaml
02:07:16 | INFO | Active profile: local (http://127.0.0.1:8088)
02:07:16 | INFO | SDK version: 0.2.2
02:07:16 | INFO | Active catalog: default
02:07:16 | INFO | 0 feature list, 0 feature deployed
02:07:16 | INFO | (1/4) Starting featurebyte services
Container mongo-rs Running
Container spark-thrift Running
Container redis Running
Container featurebyte-server Running
Container featurebyte-worker Running
Container redis Waiting
Container mongo-rs Waiting
Container mongo-rs Waiting
Container redis Healthy
Container mongo-rs Healthy
Container mongo-rs Healthy
02:07:17 | INFO | (2/4) Creating local spark feature store
02:07:17 | INFO | (3/4) Import datasets
02:07:18 | INFO | Dataset grocery already exists, skipping import
02:07:18 | INFO | Dataset healthcare already exists, skipping import
02:07:18 | INFO | Dataset creditcard already exists, skipping import
02:07:18 | INFO | (4/4) Playground environment started successfully. Ready to go! 🚀
Create a pre-built catalog for this tutorial, with the data, metadata, and features already set up
Note that creating a pre-built catalog is not something you would do in a real project. The function is specific to this quick-start tutorial: it skips many of the preparatory steps and gets you to the point where you can materialize features.
In a real project, you would first do the data modeling: declaring the tables, the entities, and their associated metadata. This is not a frequent task, but it forms the basis for best-practice feature engineering.
In [2]:
# get the functions to create a pre-built catalog (explicit names instead of a wildcard import)
from prebuilt_catalogs import PrebuiltCatalog, create_tutorial_catalog
# create a new catalog for this tutorial
catalog = create_tutorial_catalog(PrebuiltCatalog.Playground_CreditCard)
Cleaning up existing tutorial catalogs
02:07:19 | INFO | Catalog activated: deep dive materializing features 20230511:0202
Cleaning catalog: deep dive materializing features 20230511:0202
1 batch feature tables
1 batch request tables
1 historical feature tables
2 observation tables
Done! |████████████████████████████████████████| 100% in 6.1s (0.17%/s)
Done! |████████████████████████████████████████| 100% in 6.1s (0.17%/s)
Done! |████████████████████████████████████████| 100% in 6.1s (0.17%/s)
Done! |████████████████████████████████████████| 100% in 6.0s (0.17%/s)
Done! |████████████████████████████████████████| 100% in 6.0s (0.17%/s)
02:07:50 | INFO | Catalog activated: default
02:07:50 | INFO | Catalog activated: credit card playground 20230511:0207
Building a playground catalog for credit cards named [credit card playground 20230511:0207]
Creating new catalog
Catalog created
Registering the source tables
Registering the entities
Tagging the entities to columns in the data tables
##################################################################
# suggested script to load the tables and views into your notebook
# get the table objects
cardtransactiongroups_table = catalog.get_table("CARDTRANSACTIONGROUPS")
cardfraudstatus_table = catalog.get_table("CARDFRAUDSTATUS")
cardtransactions_table = catalog.get_table("CARDTRANSACTIONS")
creditcard_table = catalog.get_table("CREDITCARD")
statedetails_table = catalog.get_table("STATEDETAILS")
bankcustomer_table = catalog.get_table("BANKCUSTOMER")
# get the view objects
cardtransactiongroups_view = cardtransactiongroups_table.get_view()
cardfraudstatus_view = cardfraudstatus_table.get_view()
cardtransactions_view = cardtransactions_table.get_view()
creditcard_view = creditcard_table.get_view()
statedetails_view = statedetails_table.get_view()
bankcustomer_view = bankcustomer_table.get_view()
##################################################################
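If you prefer not to keep twelve separate variables around, the suggested script can be folded into a small helper that returns the tables and views as dictionaries keyed by table name. This is only a sketch: the helper name `load_tables_and_views` and the constant `CREDIT_CARD_TABLES` are hypothetical and not part of featurebyte. The helper relies solely on the two calls shown in the suggested script, `catalog.get_table(name)` and `table.get_view()`.

```python
from typing import Any, Dict, Iterable, Tuple

# Table names copied from the suggested script printed by the tutorial helper.
CREDIT_CARD_TABLES = (
    "CARDTRANSACTIONGROUPS",
    "CARDFRAUDSTATUS",
    "CARDTRANSACTIONS",
    "CREDITCARD",
    "STATEDETAILS",
    "BANKCUSTOMER",
)


def load_tables_and_views(
    catalog: Any,
    table_names: Iterable[str] = CREDIT_CARD_TABLES,
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Hypothetical helper: fetch each table and its view from the catalog.

    Assumes only that `catalog` exposes get_table(name) and that each
    returned table exposes get_view(), as in the suggested script.
    """
    # get the table objects, keyed by table name
    tables = {name: catalog.get_table(name) for name in table_names}
    # get the view objects, keyed by the same names
    views = {name: table.get_view() for name, table in tables.items()}
    return tables, views
```

With the catalog created in the cell above, `tables, views = load_tables_and_views(catalog)` would then give access to, for example, `views["CARDTRANSACTIONS"]` in place of `cardtransactions_view`.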