Skip to content

ViewColumn

A ViewColum object represents a column in a View object, which can undergo various transformations depending on its data type.

Transforming ViewColumn

Generic Transforms

Generic transforms can be applied to any data type within a view. Find a list of generic transforms in the glossary.

Numeric Transforms

Built-in arithmetic operators (+, -, *, /) can be used to manipulate numeric columns, as shown below:

# Get the view directly from the catalog
invoice_view = catalog.get_view("GROCERYINVOICE")
# Add a column that measures the Discount ratio
items_view["Discount_ratio"] = items_view["Discount"] / items_view["TotalCost"]

Refer to the glossary for additional numeric transforms.

String Transforms

String columns can be concatenated directly:

# Add a column that concatenates two string columns
toy_example = (
    items_view["Quantity"].astype("str") + " * " +
    items_view["GroceryProductGuid"]
)

Check the glossary for other String transforms.

Datetime Transforms

Datetime columns can be transformed in various ways, such as calculating differences, adding time intervals, or extracting date components. The glossary provides a list of supported dateparts transforms.

Lag Transforms

Retrieve preceding values associated with a specific entity, using the lag() method:

ts_col = invoice_view[invoice_view.timestamp_column]
invoice_view["InterEventTime"] = (
    ts_col - ts_col.lag("GroceryCustomerGuid")
).dt.day

Note

Lags transforms are only supported for Event and Change views.

UDF Transforms

List user-defined functions available in the catalog using the list_user_defined_functions() method:

catalog.list_user_defined_functions()

Retrieve a UDF instance from the catalog using the get_user_defined_function() method:

embedding_udf = catalog.get_user_defined_function("embedding")

Get the user-defined function's metadata using the info() method:

embedding_udf.info()

Execute the function on any view column or feature with a compatible data type:

product_view = catalog.get_view("GROCERYPRODUCT")
product_view["TransactionDescription_Embedding"] = embedding_udf(
    product_view["TransactionDescription"]
)

Conditional Transforms

Apply if-then-else transforms by filtering rows through conditional statements.

# Flag items as discounted, free, or undiscounted
discounted_filter = items_view["Discount"] > 0
free_filter = items_view["TotalCost"] == 0

items_view["DiscountCategory"] = "Undiscounted"
items_view.DiscountCategory[discounted_filter] = "Discounted"
items_view.DiscountCategory[free_filter] = "Free"

Exploring ViewColumn

Perform exploratory analysis on view columns, such as descriptive statistics, preview, and sample. These operations are executed after cleaning operations set at the table level or during view creation if created in manual mode.

# Obtain descriptive statistics for a view column
invoice_view.Amount.describe()
# Preview a selection of rows from the view column
df = invoice_view.Amount.preview(limit=20)
# Sample random rows from the view column based on a specified time range, size, and seed
df = invoice_view.sample(
    from_timestamp=pd.Timestamp('2023-04-01'),
    to_timestamp=pd.Timestamp('2023-05-01'), size=100, seed=23
)

Copying ViewColumn

When extracting a ViewColumn from a View object, a copy of the original object is created. This can be assigned back to the view as a new column or used for further transformations.

# Extract amount
amount = invoice_view[Amount]
amount.fillna(0)
# Assign the transformed amount Series into the view as a new column
invoice_view[cleaned_amount] = amount

Defining ViewColumn as a Feature

Directly use a view column as a Lookup feature for the entity identified by the view's primary key or the natural key of an SCDView object by using the as_feature() method:

# Create a feature from the OperatingSystemIsWindows column
uses_windows = customer_view.OperatingSystemIsWindows.as_feature("UsesWindows")
uses_windows.save()