### (Optional) Updating descriptions of tables and columns

Table and column descriptions are automatically fetched from your Data Warehouse when they are available. If these descriptions are missing or incomplete, you have the option to edit and update them.

While not mandatory, adding concise descriptions to tables and columns is especially useful if you are using FeatureByte Enterprise.
These annotations help FeatureByte's feature ideation engine generate insightful features.

Much like a data scientist, FeatureByte tries to understand the meaning and purpose of each table and column, including their types. Based on this understanding, it suggests relevant aggregations and feature combinations. FeatureByte can operate without these descriptions, but having them improves the quality of its recommendations.

In [1]:
import featurebyte as fb

# Set your profile to the tutorial environment
fb.use_profile("tutorial")

catalog_name = "Grocery Dataset Tutorial"
catalog = fb.Catalog.activate(catalog_name)

16:05:56 | INFO     | Using profile: tutorial
16:05:56 | INFO     | Using configuration file at: /Users/gxav/.featurebyte/config.yaml
16:05:56 | INFO     | Active profile: tutorial (https://tutorials.featurebyte.com/api/v1)
16:05:56 | INFO     | No catalog activated.
16:05:56 | INFO     | Catalog activated: Grocery Dataset Tutorial

First, list the tables in the catalog, then retrieve each one:

In [2]:
catalog.list_tables()

| | id | name | type | status | entities | created_at |
|---|---|---|---|---|---|---|
| 0 | 666956c78080c62d0dc616e2 | GROCERYPRODUCT | dimension_table | PUBLIC_DRAFT | [product, productgroup] | 2024-06-12T08:05:27.992000 |
| 1 | 666956c58080c62d0dc616e1 | INVOICEITEMS | item_table | PUBLIC_DRAFT | [item, invoice, product] | 2024-06-12T08:05:26.172000 |
| 2 | 666956c38080c62d0dc616e0 | GROCERYINVOICE | event_table | PUBLIC_DRAFT | [invoice, customer] | 2024-06-12T08:05:24.205000 |
| 3 | 666956c28080c62d0dc616df | GROCERYCUSTOMER | scd_table | PUBLIC_DRAFT | [customer, frenchstate] | 2024-06-12T08:05:22.270000 |


In [3]:
customer_table = catalog.get_table("GROCERYCUSTOMER")
invoice_table = catalog.get_table("GROCERYINVOICE")
items_table = catalog.get_table("INVOICEITEMS")
product_table = catalog.get_table("GROCERYPRODUCT")

#### Let's discover the current descriptions of tables:

In [4]:
customer_table.description

'Customer details, including their name, address, and date of birth.'

In [5]:
invoice_table.description

'Grocery invoice details, containing the timestamp and the total amount of the invoice.'

In [6]:
items_table.description

'The grocery item details within each invoice, including the quantity, total cost, discount applied, and product ID.'

In [7]:
product_table.description

'The product group description for each grocery product.'
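
The four lookups above can also be gathered in one pass. The following is a minimal sketch that assumes only the `catalog.get_table(name)` call and the `description` attribute shown above. The helper name `collect_table_descriptions` is hypothetical and not part of the FeatureByte SDK, and `_StubCatalog` is a stand-in so the snippet runs without a live catalog.

```python
from types import SimpleNamespace


def collect_table_descriptions(catalog, table_names):
    """Hypothetical helper: map each table name to its current description.

    Uses only catalog.get_table(name) and the .description attribute.
    """
    return {name: catalog.get_table(name).description for name in table_names}


class _StubCatalog:
    """Offline stand-in for a catalog; not a FeatureByte class."""

    def __init__(self, descriptions):
        self._descriptions = descriptions

    def get_table(self, name):
        # Return an object exposing a .description attribute, like a table does
        return SimpleNamespace(description=self._descriptions[name])


stub_catalog = _StubCatalog(
    {
        "GROCERYCUSTOMER": "Customer details, including their name, address, and date of birth.",
        "GROCERYPRODUCT": "The product group description for each grocery product.",
    }
)

descriptions = collect_table_descriptions(
    stub_catalog, ["GROCERYCUSTOMER", "GROCERYPRODUCT"]
)
for name, text in descriptions.items():
    print(f"{name}: {text}")
```

With a real session, passing the activated `catalog` object in place of `stub_catalog` should behave the same way, provided the table names exist in the catalog.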

#### Let's update the description of one table:

In [8]:
customer_table.update_description('Customer details')

In [9]:
customer_table.description

'Customer details'

In [10]:
# Restore the original description
customer_table.update_description('Customer details, including their name, address, and date of birth.')

#### Let's discover the current descriptions of columns for each table:

You can either display all columns together:

In [11]:
import pandas as pd
pd.DataFrame(customer_table.info(verbose=True)["columns_info"])

| | name | dtype | entity | semantic | critical_data_info | description |
|---|---|---|---|---|---|---|
| 0 | RowID | VARCHAR | | scd_surrogate_key_id | | Unique identifier of each row in the table, in... |
| 1 | GroceryCustomerGuid | VARCHAR | customer | scd_natural_key_id | | Unique identifier for each customer, in GUID f... |
| 2 | ValidFrom | TIMESTAMP | | scd_effective_timestamp | | GMT timestamp of when this version of a custom... |
| 3 | Gender | VARCHAR | | | | The customer's gender. Can only have values ma... |
| 4 | Title | VARCHAR | | | | The customer's title. Can only have values Dr.... |
| 5 | GivenName | VARCHAR | | | | The customer's given name. |
| 6 | MiddleInitial | VARCHAR | | | | The first letter of the customer's middle name. |
| 7 | Surname | VARCHAR | | | | The customer's family or surname. |
| 8 | StreetAddress | VARCHAR | | | | The customer address's building number and str... |
| 9 | City | VARCHAR | | | | The city name of the customer's address. |


Or display each column one by one:

In [12]:
for column in customer_table.columns:
    print(f"{column}: {customer_table[column].description}")

RowID: Unique identifier of each row in the table, in GUID format. Uniquely identifies each customer and version combination.
GroceryCustomerGuid: Unique identifier for each customer, in GUID format.
ValidFrom: GMT timestamp of when this version of a customer's details becomes valid or live.
Gender: The customer's gender. Can only have values male or female.
Title: The customer's title. Can only have values Dr. Mr. Mrs. or Ms.
GivenName: The customer's given name.
MiddleInitial: The first letter of the customer's middle name.
Surname: The customer's family or surname.
StreetAddress: The customer address's building number and street name.
City: The city name of the customer's address.
State: The state name of the customer's address.
PostalCode: The postal code of the customer's address. Contains only digits but is a categorical variable.
BrowserUserAgent: The user agent details of the customer's internet browser, including the operating system name and version, and the browser name and 

If the description is incorrect or incomplete, you can edit it:

In [13]:
# By using the table method: update_column_description
customer_table.update_column_description(
    "RowID",
    "Unique identifier of each row in the table, in GUID format. Uniquely identifies each customer and version combination."
)

In [14]:
# Or by using the column method: update_description
customer_table.RowID.update_description(
    "Unique identifier of each row in the table, in GUID format. Uniquely identifies each customer and version combination."
)
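
When several columns need editing, the table method can be applied in a loop. The sketch below assumes only the `update_column_description(column, text)` call and the `columns` attribute shown above. The helper name `update_column_descriptions` and the `_StubTable` stand-in are hypothetical and exist only for illustration.

```python
def update_column_descriptions(table, descriptions):
    """Hypothetical helper: apply several column descriptions in one call.

    Calls table.update_column_description(column, text) for each entry and
    returns the names that were skipped because the table has no such column.
    """
    known = set(table.columns)
    skipped = []
    for column, text in descriptions.items():
        if column not in known:
            skipped.append(column)
            continue
        table.update_column_description(column, text)
    return skipped


class _StubTable:
    """Offline stand-in that records updates; not a FeatureByte class."""

    def __init__(self, columns):
        self.columns = list(columns)
        self.updated = {}

    def update_column_description(self, column, text):
        self.updated[column] = text


stub_table = _StubTable(["GivenName", "Surname", "City"])
skipped = update_column_descriptions(
    stub_table,
    {
        "GivenName": "The customer's given name.",
        "City": "The city name of the customer's address.",
        "NotAColumn": "This entry is skipped.",
    },
)
print("updated:", sorted(stub_table.updated))
print("skipped:", skipped)
```

Returning the skipped names instead of raising keeps one typo in the mapping from aborting the remaining updates; raising an error would be an equally reasonable choice.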

That's it for this tutorial.
Again, this step is optional, but it can improve the quality of FeatureByte's feature ideation.

#### SDK reference for:
- [Table.info](https://docs.featurebyte.com/latest/reference/featurebyte.api.table.Table.info/)